Alternative climate metrics to the Global Warming Potential are more suitable for assessing aviation non-CO2 effects

A growing body of research has highlighted the major contribution of aviation non-CO2 emissions and effects to anthropogenic climate change. Regulation of these emissions, for example in the EU Emissions Trading System, requires the use of a climate metric. However, choosing a suitable climate metric is challenging due to the high uncertainties of aviation non-CO2 climate impacts, their variability in atmospheric lifetimes and their dependence on emission location and altitude. Here we use AirClim to explore alternatives to the conventional Global Warming Potential (GWP) by analysing the neutrality, temporal stability, compatibility and simplicity of existing climate metrics and perform a trade-off. We find that using the temperature-based Average Temperature Response (ATR) or using an Efficacy-weighted GWP (EGWP) would enable a more accurate assessment of existing as well as future aircraft powered by novel aviation fuels.


Liam Megill 1,2, Kathrin Deck 2 & Volker Grewe 1,2
The aviation industry contributes to anthropogenic climate change through CO2 and non-CO2 emissions. Recent studies have underscored the significance of aviation non-CO2 emissions, which are now thought to be responsible for around two-thirds of the total warming from aviation 1. Of primary importance is the release of nitrogen oxides (NOx) 2-6, water vapour 7,8 and aerosols 9-12, and also the formation of contrails 13-18. The EU Parliament recently adopted legislation (Directive 2023/958 of May 10, 2023 amending Directive 2003/87/EC) that aims to revise the EU Emissions Trading System (ETS) for aviation, inter alia requiring the European Commission to include aviation non-CO2 effects in a monitoring, reporting and verification (MRV) framework and, if deemed appropriate, expand the scope of the ETS to include aviation non-CO2 effects by the end of 2027. Such implementation requires the use of a climate metric, which relates non-CO2 emissions and effects to their consequences on the climate and/or on society 19-22. To maintain compatibility with market-based or offsetting schemes such as the ETS, climate metrics are often used as exchange rates, expressing non-CO2 emissions on a common scale with CO2 emissions. This single-basket approach can simplify climate negotiations and the implementation of climate policies 22.
However, establishing an adequate equivalence is not trivial and there is currently no consensus on which climate metric is most appropriate for aviation.In international climate policy, the most commonly used climate metric is the Global Warming Potential (GWP) 19,23 , although it has been heavily criticised, primarily due to its dependence on the time horizon 20,[24][25][26][27] .The choice of climate metric for aviation climate policy is further complicated because aviation non-CO 2 emissions and effects have highly varying atmospheric lifetimes and efficacies 28 , are dependent on the emission time, altitude and location 29,30 and their impacts on the climate have a high degree of uncertainty 1,31 .Furthermore, since each climate metric uses a different climate indicator (e.g., stratospheric-adjusted radiative forcing or global mean near-surface temperature change) 32 and calculation method, a climate metric can inherently and inadvertently place emphasis on certain aircraft design choices, emission species or effects.The choice of climate metric is thus an important consideration for all stakeholders to ensure that the implementation of climate policy results in the desired reduction of the aviation industry's impact on climate and on society 22,32 .
In this paper, we explore the applicability of existing, physical climate metrics to the aviation industry. Specifically, we analyse the compatibility to aircraft design and aviation policy, methodological simplicity, neutrality and stability of the following conventional climate metrics: Radiative Forcing (RF) and the Radiative Forcing Index (RFI, the relative RF) 33,34, GWP 19,23, Global Temperature Change Potential (GTP) 35, Integrated GTP (iGTP) 36 and Average Temperature Response (ATR) 37. The performance of a recently proposed, unconventional method that relates the changes in emission rates of short-lived species to pulses of CO2, denoted GWP* 38-40, is also evaluated. We further analyse derivatives of the GWP and GWP* that are weighted by the efficacy, which we denote the Efficacy-weighted Global Warming Potential (EGWP) and EGWP* respectively. We find that, compared to the dominant GWP, a more accurate assessment of existing as well as future aircraft powered by novel aviation fuels would be enabled through the introduction of the ATR or EGWP into climate policy. We recommend further research into the potential use of the EGWP and into new efficacy estimates for aviation non-CO2 emissions.

Development of climate metric requirements
Climate metrics are used in both their absolute form and relative to CO2. In aviation, absolute climate metrics have two primary use cases: in trajectory optimisation, where aircraft are re-routed to avoid climate-sensitive regions 41,42; and in aircraft design, where the climate metric can be part of the design trade-off process 43. Relative metrics are primarily used at a policy level, notably to calculate CO2-equivalent emissions and multipliers in single-basket emissions trading schemes such as the ETS 19,22.
From these use cases, we identify the following main requirements for climate metrics used for aviation. Aviation climate metrics shall:
1. Neutrally represent the chosen climate indicator (REQ 1) 32,44. Value judgements should be left to policymakers and should not be built into climate metrics. Therefore, a climate metric should not exhibit any inherent bias towards specific aircraft design changes.
2. Be temporally stable (REQ 2) 45. For aviation policy, it should be possible to use climate metrics to monitor how well the industry is performing through annual and quarterly reports. The results shown by a climate metric, and thus the emission offsetting cost, should not vary to the extent that policymakers cannot gauge the effectiveness of their policies, and airlines and other stakeholders cannot estimate their offsetting cost.
3. Be compatible with existing climate policy (REQ 3) 45. A new climate metric must still be able to perform the same functions in the current climate policy context.
4. Be simple to understand and implement (REQ 4) 20,21,45. Non-specialists should be able to understand how a climate metric is calculated and be able to correctly interpret what its results show.
In the following, we analyse each requirement individually and recommend the best-suited climate metric based on the results.We also analyse the impact of the time horizon on the results.We perform our analyses using the climate-chemistry response model AirClim 46,47 , which provides yearly global mean radiative forcing and temperature change values from spatially resolved aviation scenarios for CO 2 , water vapour, contrails and NO x -induced changes in ozone (short and long-term) and methane.For the purposes of analysing climate metrics with time horizons in the order of years, the responses of other very short-lived species such as aerosols are expected and assumed to be qualitatively the same as for contrails.
Note that in this paper, we add an A to denote an absolute climate metric (i.e., AGWP, AEGWP, AGTP, iAGTP) and use rATR to denote the relative ATR (to CO2). The GWP* and EGWP* do not have absolute and relative forms. Where necessary for clarification, we also use P-, F- and S- to denote climate metrics calculated using a pulse, fleet or total aviation industry emission scenario, respectively.

REQ 1: climate metric neutrality with respect to aviation emissions
To assess the neutrality of climate metrics for aircraft design, the peak and average total temperatures and climate metric values of potential future fleets are compared. A wide range of narrowbody fleets are generated using a Monte Carlo simulation of various high-level aircraft parameters, including the use of conventional as well as novel aviation fuels such as SAF and hydrogen. The fleets are analysed using the climate-chemistry response model AirClim 46,47, as described in "Methods" (cf. Table 3). The neutrality of each climate metric is gauged by the frequency f of incorrect fleet pairs, defined here as pairs for which the signs of the differences in peak/average temperature (ΔT) and total climate metric value (CM) between any two fleets i and j do not match, relative to the total number of fleet pairs (cf. the method used by Grewe et al. 48). The total number of fleets is N = 10,000. Figure 1 shows the frequency of incorrect fleet pairs as a function of the time horizon H; example results for a single time horizon H = 100 are shown in Supplementary Fig. 1.
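The pairwise comparison described above can be sketched in a few lines. This is only an illustration of the counting rule, not the AirClim workflow: the function names, the toy inputs and the choice to count each unordered pair (i < j) exactly once are assumptions made here.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def incorrect_pair_frequency(delta_t, cm):
    """Fraction of unordered fleet pairs whose temperature ranking and
    climate-metric ranking disagree in sign.

    delta_t: peak or average temperature change of each fleet
    cm:      total climate metric value of each fleet (same order)
    """
    if len(delta_t) != len(cm):
        raise ValueError("delta_t and cm must have the same length")
    n = len(delta_t)
    if n < 2:
        raise ValueError("at least two fleets are needed to form a pair")
    mismatches = sum(
        1
        for i, j in combinations(range(n), 2)
        if sign(delta_t[i] - delta_t[j]) != sign(cm[i] - cm[j])
    )
    return mismatches / (n * (n - 1) // 2)


# Toy example with three hypothetical fleets: the metric ranks fleets 2 and 3
# in the opposite order to the temperature, so one of three pairs is incorrect.
f = incorrect_pair_frequency([1.0, 2.0, 3.0], [10.0, 30.0, 20.0])
print(f)
```

For the 10,000 fleets used in the study this pure-Python loop visits roughly 5 × 10^7 pairs, so a vectorised implementation would be preferable in practice.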
In general, the similarity shown in Fig. 1 between the results for all climate metrics and climate objectives suggests that the peak temperature is also a good indicator for the average temperature and vice versa over a wide range of time horizons. The endpoint climate metrics F-RF and F-AGTP show a clear dependence on the time horizon and hence on the shape of the temporal emission profile (temporal evolution of yearly emissions per species). This demonstrates that a single value of radiative forcing or temperature at a time in the future is not a good indicator of the peak or average temperature. Since the radiative forcing from a number of aviation non-CO2 effects, such as the warming effect from the NOx-induced short-term increase in ozone (e.g., ref. 1), has dissipated at large time horizons, the F-RF in particular can have low or altogether no sensitivity to a number of aircraft and engine design changes, leading to a high rate of incorrect fleet pairs. The integrated climate metrics F-AGWP, F-AEGWP and F-iAGTP/F-ATR, in comparison, have a memory of these previous emissions and are less dependent on the temporal emission profile.
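The difference between endpoint and integrated metrics can be illustrated with a toy forcing profile. The sketch below is not based on AirClim output: the exponential decay, its 2-year time constant and the function names are hypothetical choices made purely for illustration.

```python
import math


def endpoint_value(series, horizon):
    """Endpoint metric: the value of a yearly series at year `horizon`
    (index 0 is the emission year)."""
    return series[horizon]


def integrated_value(series, horizon):
    """Integrated metric: trapezoidal integral of a yearly series from
    year 0 to year `horizon`."""
    return sum(0.5 * (series[k] + series[k + 1]) for k in range(horizon))


# Hypothetical short-lived forcing (arbitrary units) that decays within a few years.
short_lived = [math.exp(-t / 2.0) for t in range(101)]

for h in (20, 100):
    print(h, endpoint_value(short_lived, h), integrated_value(short_lived, h))
```

At a 100-year horizon the endpoint value of this short-lived forcing is effectively zero, whereas the integral still retains its full contribution, which mirrors the "memory" of integrated metrics described above.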
The F-AGWP and F-GWP* show largely linear responses for peak and average temperature, particularly for time horizons above 60 years, but in general have higher frequencies of incorrect fleet pairs than climate metrics based on temperature or using efficacy.The F-EGWP* has a similarly low dependence on the time horizon with a lower frequency of incorrect fleet pairs, demonstrating almost ideal behaviour in this context.Whilst the F-iAGTP/F-ATR has a clear minimum at 70 years for peak temperature and at 20, 50 and 100 years for the corresponding average temperatures, it generally has a frequency of incorrect fleet pairs of less than 2%.The F-AEGWP performs very similarly, surpassing 2% error frequency only for the 20-year average temperature, and has clear minima at slightly lower time horizons: 55, 15, 45 and 80 years, respectively.

REQ 2: temporal stability
The temporal stability of climate metrics is judged using CO 2 -eq trajectories for the full aviation industry.In this work, we use the CORSIA and FP2050 scenarios developed by Grewe et al. 49 as examples.The CORSIA scenario assumes business as usual, but that CO 2 emissions are offset beyond 2020; whereas the FP2050 scenario makes use of the Flightpath 2050 targets: 75% CO 2 and 90% NO x reduction by 2050. Figure 2 shows the CO 2 -eq emissions calculated for both scenarios using each climate metric with a 100-year time horizon.Two elements of the responses are highlighted here.
First, the total CO2-eq values calculated using the endpoint S-RFI and S-GTP climate metrics are very similar, although as the emission rate (rate of change of yearly emissions over time) reduces around the year 2020 in the FP2050 scenario the results begin to drift apart. The RFI and GTP can thus be seen to be stable for the analysed full aviation emissions. However, both climate metrics can struggle to show qualitatively the same response for pulse and constant emissions (see e.g., ref. 50, their Fig. 3.3), depending on the chosen time horizon.
The S-GWP and S-iGTP/S-rATR show very similar responses; the S-EGWP also produces similar, albeit generally lower results. The similarity between the S-GWP and S-iGTP/S-rATR potentially allows for species-dependent conversion factors and reduces the political capital required to switch from the standard GWP to either the iGTP or rATR in climate policy. This is a somewhat surprising result, since the bases for the climate metric calculations differ, affecting the contributions of individual species to the total CO2-eq: the GWP is RF-based, whereas the ATR is temperature-based. As a result, the GWP emphasises contrail cirrus, and the ATR the warming effect of NOx-induced ozone (see Supplementary Fig. 2). Nevertheless, for full aviation scenarios assuming Jet-A1 fuel, the differing contributions seem to balance out. It is, however, likely that the introduction of novel propulsion technologies and fuels, which change the emission indices relative to one another, will result in a divergence of the total CO2-eq emissions calculated by both metrics. The rATR would then likely more closely match the EGWP than the GWP. Further research could analyse the response from different models and emission inventories to check the validity of conversion factors, in particular for novel aviation fuels.
A second noticeable element is the rapid deviation of the S-GWP* and S-EGWP* from the response shown by all other climate metrics in both scenarios. The GWP* calculation method uses an average of the previous 20 years of radiative forcing and is closely tied to the emission rate given by the scenario. Therefore, small changes in the emission rate, in the years 2020 and 2050 in the CORSIA scenario, result in large changes in the CO2-eq trajectory. A policymaker using the GWP* or EGWP* method to monitor CO2-eq emissions between 2030 and 2050 could incorrectly assume that the impact of aviation is reducing, when in actuality only the emission rate has decreased. This instability is particularly problematic for the FP2050 scenario. Whilst the values from all other climate metrics largely correspond to the reducing fuel use, the S-GWP* and S-EGWP* show negative CO2-eq emissions between 2050 and 2080. Whilst this behaviour is useful for representing the temperature using a cumulative integral, negative CO2-eq values could easily be misinterpreted as a sign that aviation is causing an active cooling. The magnitude of the negative CO2-eq values is also disproportionately large compared to the shallow peak shown in the temperature.

REQ 3: compatibility with existing climate policy
To be compatible with existing climate policy means that a climate metric can be used in current climate frameworks and methods.These have generally been established on the basis of the GWP, which has become the most commonly used climate metric.There is thus a natural bias towards climate metrics that behave in a similar manner to the GWP.For aviation, this functionally means that any alternative to the GWP must be able to (1) calculate the temporal trajectories of CO 2 -eq emissions; and (2) calculate single values for fleets and individual flights, the latter of which is necessary for the introduction of aviation non-CO 2 emissions into the ETS.The climate metrics RF, GWP, EGWP, GTP, iGTP and ATR are able to perform these functions.However, the GWP* and EGWP* struggle to provide a single value for an individual fleet or flight.
Rather than providing a single value for a given time horizon, the GWP* method provides a temporal trajectory for each emission species, as shown in Fig. 3 for a simple fleet temporal emission profile.If used as a climate metric, for example, to compare this fleet to another, there is no obvious point along the temporal trajectory to choose.Indeed, the choice of which point to use is itself a trade-off between different emission species.Therefore, whilst the GWP* method is useful for certain technical discussions, it should be seen as a model rather than a metric, as previously argued by Meinshausen and Nicholls 45 , and should not be viewed as a potential replacement for the GWP in aviation policy.

REQ 4: simplicity
The endpoint climate metrics RF and GTP are clearly the easiest to understand.It is straightforward to determine how these climate metrics behave for different time horizons, background emissions scenarios and fuel scenarios.Integrated climate metrics-GWP, EGWP, iGTP and ATR-are more complex and it can be difficult to ascertain the impacts of individual effects and species on the results.The least simple to understand and implement are the GWP* and EGWP*: Their behaviour can be puzzling even for simple temporal emission profiles and can show initially counterintuitive results, such as the negative emissions in Fig. 2.
In comparison to temperature-based climate metrics (GTP, iGTP, ATR), climate metrics based on radiative forcing (RF, GWP) are easier to implement since they do not need a full climate or carbon-cycle model.The EGWP requires efficacy values and is thus more complex.However, the demand on computational time depends on which model is used to calculate the climate metric values.The GWP* and EGWP* are RF-based climate metrics, but they do use the AGWP for CO 2 , which would need to be defined as a standard value or calculated for a given scenario.In addition, these metrics require the temporal emission profile twenty years prior to any value, since the method uses a 20-year running average.This could potentially complicate the implementation of the GWP* method.

Choice of climate metric and time horizon
An overview of the performance of all analysed climate metrics is shown in Table 1.It is clear that the choice of climate metric must be the result of a trade-off.Based on our analysis and definition of requirements, the ATR and EGWP can be seen to perform best.Here, we inspect in more detail the advantages and disadvantages of these metrics in comparison to the GWP and investigate their dependence on the time horizon.We note that the ATR and iAGTP differ only in the division of the time horizon; in their relative forms, the rATR and iGTP are identical.However, the ATR is chosen rather than the iAGTP because the division by the time horizon improves the stability of the absolute climate metric responses.
The ATR and EGWP perform similarly well in the pairwise fleet analysis (REQ 1) for the peak and average temperature climate indicators, as well as in analysis on temporal stability (REQ 2).The rATR 100 produces CO 2 -eq emissions that very closely match those calculated using the GWP 100 , potentially easing the introduction of the ATR in climate policy (REQ 3); introduction of the EGWP is even simpler since it is only a derivative of the GWP.Finally, although the concept of an average temperature change is simple to understand for non-specialists, it can be difficult to identify the impacts of specific effects on results.In comparison to the GWP, the EGWP and in particular the ATR as a temperature-based metric, include more climatic processes, but thus also more assumptions and uncertainties.Implementation of both climate metrics can thus be seen as complex (REQ 4).Further work is required to better understand the benefits and potential downsides of the EGWP in other contexts.Further research into best estimates of the efficacy would also be beneficial.
The dependence of the GWP, EGWP and ATR on the time horizon is shown in Fig. 4 for three different emission scenarios. Individual overviews for each of these metrics and the GTP are provided in Supplementary Figs. 3-6. The inclusion of the efficacy in the EGWP is evident from the lower panel of the figure, affecting the relative importance of ozone and contrails in particular. The results of the RF-based EGWP now much more closely match those of the temperature-based ATR. Nevertheless, the total relative metric values and thus calculated CO2-eq emissions of the GWP, EGWP and ATR are very similar, especially for large time horizons. In general, the sensitivity of all three metrics to the time horizon, represented by the gradient of the relative metric values, decreases with increasing time horizon. Using a low time horizon, for example, 20 years, would require particular justification: why was 20 years chosen rather than say 15, 25 or even 19 years? Instead, the responses suggest that larger time horizons, greater than around 70 years, are most suitable for integrated climate metrics. This is particularly true for the ATR, which requires larger time horizons to properly account for the delay in the temperature response of the atmosphere.

Fig. 3 | CO2-eq emissions calculated using the GWP* 100 method for an example fleet demonstrating the flow-based nature of the GWP*. The inset figure shows the temporal emission profile; the main figure the CO2-eq response for each species (colours). For the comparison of fleets in this study (F-GWP*), the peak total value is used, in this example occurring in the year 2050. Note that the total value is dominated by the contrail impact and that different species have their peaks at later times. Therefore, the choice of which value to use is itself a trade-off between different emission species.

Discussion
Our analyses demonstrate that the selection of a climate metric plays a crucial role in ensuring that implemented climate policies effectively reduce the aviation industry's impact on the climate.In the fleet pairing analysis, we illustrate that climate metrics can have inherent trade-offs and favour certain aircraft designs over others.These inherent biases are undesirable since value judgements should be left to policy decision-making and not embedded into climate metrics.
The choice of climate metric is always the result of a trade-off.Due to the historical dominance of the GWP, there is a natural bias towards climate metrics that behave in a similar manner.However, our research clearly suggests that there are derivatives and alternatives that outperform the GWP for aviation.We require that a suitable climate metric displays neutrality with respect to different emission species; exhibits temporal stability; is compatible with existing climate policy; and is simple to understand and implement.These requirements are in line with those stated by others 20,45 .Based on these requirements, we identify the Efficacy-weighted Global Warming Potential (EGWP) and Average Temperature Response (ATR) as the most appropriate climate metrics for aircraft design and aviation policy.Both metrics are stable and can monitor the impact of the aviation industry using CO 2 -equivalents effectively.They also do not favour specific emission species for both peak and average temperature climate indicators across a wide range of time horizons and emission scenarios.
Whilst the ATR as a temperature-based climate metric has the potential to include more climatic processes and be more relevant for temperature-based targets than the GWP, the larger number of assumptions and uncertainties must also be considered. The EGWP may, therefore, be a useful compromise for policymakers, in that it can more accurately represent the climate impact of aviation whilst still using the GWP methodology. Further research is recommended into the advantages and potential disadvantages of using the EGWP. If the ATR were to be chosen, it would benefit from the close match of the total CO2-eq emissions calculated by the S-rATR 100 and S-GWP 100, despite the differences in contributions of individual species. The S-EGWP 100, for its part, also produces similar results. However, it is likely that the total emissions calculated by the ATR and GWP will diverge with the introduction of novel propulsion technologies and fuels such as hydrogen since the relative contributions of the non-CO2 emissions will change.

Fig. 4 | Shown are the responses from the GWP (solid line), EGWP (dashed line) and rATR (dotted line), which in its relative form is equivalent to the iGTP. The top row shows the total metric value relative to CO2; the bottom row the responses calculated for each species (colours) relative to the total. Three temporal emission profiles are used, for which the fuel usage profiles are shown in the inset plots: a pulse emission (P-) in (a, d); a fleet emission (F-) in (b, e) and a 1% increasing emission (I-) in (c, f). Each response is shown for the Shared Socioeconomic Pathway SSP2-4.5 with margins (shading) for scenarios SSP1 to SSP5, which are used as the background emissions scenarios in AirClim.
Determining an appropriate time horizon for both the EGWP and ATR remains a challenge.The time horizon is a trade-off between incorporating the long-term response to an emission and ensuring the predictability and accuracy of a future emission scenario.We find that integrated climate metrics generally require larger time horizons to account for the atmospheric radiative forcing and temperature adjustment.If a short time horizon is chosen, policymakers must provide sufficient justification for the choice.Alternatively, values for different time horizons could be provided together, as proposed in ref. 27, although this complicates the calculation of CO 2 -eq emissions, for example in the upcoming ETS revision.
The accuracy of our results could be improved by using real aircraft designs: since design parameters are chosen randomly within a given range, some fleets may not be physically feasible. However, since we used the same method of randomly choosing parameters, any additional incorrect fleet pairings caused by this limitation are assumed to cancel out and not impact our conclusions. Similarly, given the wide range of potential aircraft designs analysed in this study, it is unlikely that the choice of climate model, AirClim, has influenced the results significantly. Verification with another climate model may enhance our understanding of the results.
Ultimately, the most suitable climate metric and corresponding time horizon must be determined by policymakers depending on the policy and climate objective, the emission scenario, and whether a relative or absolute climate metric is required.Based on a general set of requirements suitable for policymaking, our findings endorse the use of the ATR and EGWP with a time horizon greater than 70 years for aircraft design and aviation policy to assess the long-term climate impact of aviation.However, the choice of climate metric does not have to be contentious or controversial: As our analysis and the numerous previous studies have demonstrated, tools exist with which the performance of any climate metric can be analysed and potential shortcomings and pitfalls identified, such that these can be addressed in climate policy.

Climate metric calculation methods
The calculation methods for all climate metrics are given in Table 2. These methods require a time series of radiative forcing (RF) and resulting temperature change ΔT. The calculation methods of the EGWP, ATR/iGTP and GWP*/EGWP* are described in more detail below. In this research, we use the climate-chemistry response model AirClim 46,47 to calculate the RF and ΔT responses of individual aircraft fleets using data from the DLR WeCare project 51, and of the full global fleet using scenarios developed by Grewe et al. 49. These are described in more detail in subsequent sections. For ease of comparison, we use the Shared Socioeconomic Pathway SSP2-4.5 52 as the default background emissions scenario for our analyses, but vary between SSP1 to SSP5 in the multivariate fleet analysis. AirClim is an extension to the linear response model for CO2 developed by Sausen and Schumann 53 and combines emission data with pre-calculated altitude- and latitude-dependent data obtained from steady-state simulations with the E39/CA 54 climate-chemistry model (for ozone, methane, water vapour and contrails) and ECHAM4-CCMod 55 (for contrail cirrus). It was chosen for this research due to its low computational cost and flexibility.
EGWP-The Efficacy-weighted Global Warming Potential (EGWP) was developed as a derivative of the GWP. It aims to introduce the efficacy of non-CO2 emissions into the GWP method, such that the results obtained by the GWP more closely match those of temperature-based climate metrics. The EGWP for a single species i is then the GWP of that species multiplied by its efficacy $r_i$, taken from ref. 28 (their Table 1). We note that this calculation method is still quite uncertain, in particular for contrail cirrus 56, although it is not expected to affect the results of this study. Another potential approach to calculate the EGWP would be to use the Effective Radiative Forcing (ERF) and corresponding efficacies $r'_i$. Further work is required to analyse the performance of these two climate metrics and to develop better estimates of the efficacies for aviation emissions.
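The efficacy weighting described above amounts to a per-species multiplication. The following sketch shows this under stated assumptions: the function name is hypothetical and all numerical values are placeholders chosen for illustration, not the efficacies of ref. 28 or GWP values from this study. Only the CO2 efficacy of 1 follows from the definition of efficacy as being relative to CO2.

```python
def efficacy_weighted(gwp_by_species, efficacy_by_species):
    """Return EGWP_i = GWP_i * r_i for every species.

    A species without an efficacy entry raises KeyError rather than
    silently defaulting to 1, so that missing data is noticed.
    """
    return {
        species: gwp * efficacy_by_species[species]
        for species, gwp in gwp_by_species.items()
    }


# Placeholder inputs (NOT values from the paper or from ref. 28).
gwp = {"CO2": 1.0, "contrails": 0.6, "ozone": 0.3}
efficacy = {"CO2": 1.0, "contrails": 0.5, "ozone": 1.5}

egwp = efficacy_weighted(gwp, efficacy)
print(egwp)
print(sum(gwp.values()), sum(egwp.values()))
```

With these placeholder numbers the weighting lowers the contrail contribution and raises the ozone contribution, which is the qualitative shift between RF-based and temperature-based metrics discussed in the main text.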
ATR-The Average Temperature Response (ATR) was initially developed by Dallara et al. 37 specifically for aircraft design. Initially, it included a weighting function and used an infinite time horizon H. However, the infinite time horizon in particular made it inappropriate for global fuel scenarios, for example. Since its inception, therefore, the ATR has been repurposed and is now generally used as the average temperature change over a given time horizon, as shown in Table 2. The weighting function is also no longer used. The relative ATR is denoted rATR in this research for clarity. Note that this definition of the ATR is related to the iAGTP by: $\mathrm{ATR}_H = \mathrm{iAGTP}_H / H$.
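The relation between the ATR and the iAGTP can be checked numerically. The sketch below assumes yearly temperature-change values and a trapezoidal integral; the function names and the toy temperature series are hypothetical and do not come from AirClim.

```python
def iagtp(delta_t, horizon):
    """Integrated temperature change from year 0 to `horizon`
    (trapezoidal rule on yearly values, index 0 = first year)."""
    return sum(0.5 * (delta_t[k] + delta_t[k + 1]) for k in range(horizon))


def atr(delta_t, horizon):
    """Average temperature response: the iAGTP divided by the time horizon."""
    return iagtp(delta_t, horizon) / horizon


# Toy temperature responses in kelvin (hypothetical, for illustration only).
species_dt = [0.002 * min(t, 30) for t in range(101)]  # rises for 30 years, then flat
co2_dt = [0.001 * t for t in range(101)]               # keeps rising

H = 100
r_atr = atr(species_dt, H) / atr(co2_dt, H)
i_gtp = iagtp(species_dt, H) / iagtp(co2_dt, H)
print(r_atr, i_gtp)
```

Because the division by H cancels in the ratio, the relative ATR and the iGTP coincide, as noted in the main text.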
GWP*-Since emission rates are meaningless for NO x -induced aviation effects (O 3 , long-term CH 4 reduction and the Primary Mode Ozone (PMO) effect) and for contrails, the GWP* methodology must be adapted to use radiative forcing.This equivalent calculation is proposed in the initial development of the GWP* by Allen et al. 38 and is modified using the improvements suggested by Cain et al. 39 and Smith et al. 40 to obtain:   40 to improve consistency with the linear models used for climate metric calculations.In this research, we use s = 0.75 to be consistent with Smith et al. 40 .However, we note that this value was calculated for methane (CH 4 ) and thus may not be optimal for other aviation non-CO 2 emissions and effects.
The EGWP* is a climate metric developed as part of this research as a derivative of the GWP*. Similarly to the EGWP, it makes use of the efficacy $r_i$, also taken from ref. 28, to more closely match the results obtained by temperature-based climate metrics. The GWP* methodology is adapted by replacing $\mathrm{RF}_i$ with $\mathrm{RF}_i \times r_i$.
The GWP* and EGWP* differ from the other climate metrics considered in this study in that they are flow-based climate metrics: the GWP* method does not provide a single value over a specific time horizon. Instead, it provides a CO2-eq value as a function of time, as shown in Fig. 3 (main text). To estimate the impact of a fleet or flight, a certain point along the temporal trajectory must be chosen. It can be argued that for the analysis of the peak temperature, the peak CO2-eq value should be chosen. However, the time at which the peak occurs differs per species, and can also differ per fleet, thereby raising the question whether the climate metric values of each fleet are showing the same thing and are thus intercomparable. In the example shown in Fig. 3, the peak total CO2-eq value is dominated by the contrail impact; all other emissions have their peaks at a later time. However, since no other point could be identified as appropriate, in this research we use the time of the peak total CO2-eq value for fleet comparisons; in Fig. 3, these are thus the values in 2050.

Development of fuel scenarios
This research is based on the CORSIA and Flightpath 2050 (FP2050) fuel scenarios developed by Grewe et al. 49 . Since time horizons of up to 100 years are analysed, the scenarios needed to be extended. For this research, they have been extended until the year 2200, assuming a 0.5% annual growth rate after the year 2100. The scenarios are developed to test climate metrics and have not been evaluated for reliability and accuracy.
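A minimal sketch of such an extension is given below. It assumes that the 0.5% growth is applied as compound annual growth to the last available scenario value, and that the quantity being extended is annual fuel use; both are assumptions of this sketch, as are the function name and the input values.

```python
# Sketch only: assumes compound growth on annual fuel use; inputs are placeholders.

def extend_scenario(fuel_by_year, end_year=2200, growth=0.005):
    """Extend a {year: fuel use} scenario with constant annual growth."""
    extended = dict(fuel_by_year)
    last_year = max(fuel_by_year)
    for year in range(last_year + 1, end_year + 1):
        extended[year] = extended[year - 1] * (1.0 + growth)
    return extended


# Placeholder scenario ending in 2100 (arbitrary units).
scenario = {2099: 95.0, 2100: 100.0}
extended = extend_scenario(scenario)
print(len(extended), extended[2200])
```

Under these assumptions, fuel use in 2200 is the 2100 value multiplied by 1.005 raised to the power 100.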

Fleet pairing analysis
The fleets used in this research are theoretical and characterised with a set of input parameters in AirClim, chosen uniformly from ranges shown in Table 3. The parameter ranges are based on expected technological pathways developed by Grewe et al. 49 , within the Clean Sky 2 Technology Evaluator 57 and by the "Hydrogen-powered aviation" report by the Clean Hydrogen Joint Undertaking (2020, https://doi.org/10.2843/766989). The contrail distance modifier mentioned in Table 3 is a multiplier for the total cruise distance for which contrails form, which is an AirClim input. In this context, a factor below unity corresponds to aircraft flying further to avoid climate-sensitive regions and, therefore, contrail formation. As a result, the reduction in contrail distance is coupled with an increase in fuel burn, estimated from ref. 58 to follow a ratio of −15%:1% (contrail distance to fuel burn) up to a contrail distance reduction of 60% (contrail distance modifier of 40%), which is approximately the end of the quasi-linear region of the Pareto fronts calculated. For fleets using fuels other than Jet-A1, the emissions parameters are further modified according to Table 4. Here, the contrail reduction is assumed to correspond to changes in the exhaust composition due to the use of different fuels. We note that the data in both tables are a simplification and that comprehensive data are not yet available for different fuel types. However, since our objective is to provide a wide range of potential future fleets, this simplification is deemed appropriate and should not affect the results of this research.
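The coupling between contrail distance and fuel burn can be sketched in Python as follows. The sketch reads the −15%:1% ratio as a 1% fuel burn increase for every 15% reduction in contrail distance, applied linearly up to the stated 60% reduction. The sampling range of 0.4 to 1.0 is a placeholder and does not reproduce Table 3; the function and constant names are hypothetical.

```python
import random

# Sketch only: linear reading of the -15%:1% ratio; sampling range is a placeholder.
MAX_REDUCTION = 0.60                  # largest contrail distance reduction covered
REDUCTION_PER_FUEL_INCREASE = 15.0    # 15% less contrail distance per 1% more fuel


def fuel_burn_factor(contrail_distance_modifier):
    """Fuel burn multiplier implied by a contrail distance modifier in [0.4, 1.0]."""
    reduction = 1.0 - contrail_distance_modifier
    if reduction < -1e-12 or reduction > MAX_REDUCTION + 1e-12:
        raise ValueError("modifier outside the quasi-linear range [0.4, 1.0]")
    return 1.0 + max(reduction, 0.0) / REDUCTION_PER_FUEL_INCREASE


# Uniform sampling of the modifier for a handful of hypothetical fleets.
rng = random.Random(0)
sampled_modifiers = [rng.uniform(0.4, 1.0) for _ in range(5)]
sampled_factors = [fuel_burn_factor(m) for m in sampled_modifiers]
print(list(zip(sampled_modifiers, sampled_factors)))
```

With this reading, a contrail distance modifier of 40% corresponds to a fuel burn increase of 4%. Values outside the stated range are rejected rather than extrapolated, since the source gives the ratio only for the quasi-linear region.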
For each fleet, a constant production rate is assumed, expected to last 30 years. Production is assumed to begin after 2030, approximately on par with the expected introduction of the next generation of single-aisle aircraft and new fuels such as hydrogen according to the analyses of Grewe et al. 49 . The exact year of introduction of new fleets is, however, not relevant to the outcome of this study and is thus varied. Each aircraft is further assumed to have a lifetime of 35 years with no hull losses. A single-aisle aircraft about the size of the Airbus A320 is chosen for reference. For simplicity, the fuel use of this fleet is taken to be 40% of Category 4 of the DLR WeCare project 51 , characterised by aircraft with seat numbers between 152 and 201. A total of 10,000 fleets are simulated using AirClim.

... conditioned upon the prior conclusion of a licensing agreement with the DLR. Qualified researchers can obtain an agreement on reasonable request from the corresponding author.

See the accompanying text for a more detailed description of the contrail distance modifier, and Table 4 for the impact of different fuels.

The values are taken from the "Hydrogen-powered aviation" report (Clean Hydrogen Joint Undertaking, 2020, https://doi.org/10.2843/766989). Note that these values cannot currently be corroborated through other studies and are, therefore, only used to provide a wide spectrum of potential future fleets.

Fig. 2 | Comparison of the climate metric responses for full aviation emission scenarios. Shown are the CO 2 -eq emissions calculated using each climate metric with a 100-year time horizon for the CORSIA (a) and FP2050 (b) scenarios. Each climate metric is represented by a different combination of colour, line style and marker for clarity. Also shown are the fuel use (red dashed line) and temperature (solid black line) response for each emission species calculated using AirClim for the CORSIA (c) and the FP2050 (d) scenarios. All values are calculated on a yearly basis; the markers for each species are differently spaced such that overlapping lines can more easily be identified.

Fig. 1 | Comparison of the neutrality of different climate metrics to changes in aircraft design. Shown is the frequency of incorrect fleet pairs, corresponding to those where the signs of the peak/average temperature change and climate metric change do not match, as a function of the time horizon for the peak temperature (a) and 20-, 50- and 100-year average temperature (b-d) climate objectives. Each climate metric is represented by a different combination of colour, line style and marker. Values are available between 5 and 100 years with a time horizon step of 5 years; however, markers are shown every 10 years for clarity.

Fig. 4 | Comparison of the GWP, EGWP and ATR responses for time horizons between 0 and 100 years. Shown are the responses from the GWP (solid line), EGWP (dashed line) and rATR (dotted line), which in its relative form is equivalent to the iGTP. The top row shows the total metric value relative to CO 2 ; the bottom row the responses calculated for each species (colours) relative to the total. Three

"Generally stable" refers to the finding that the RF and GTP are stable for full aviation emission scenarios, but can struggle to show qualitatively the same response for pulse and constant emissions (see e.g., ref. 50, their Fig. 3.3).

where $E_{\mathrm{CO_2\text{-}we}}(t)$ are the CO 2 -warming-equivalent emissions as a function of time, $\Delta RF$ the change in radiative forcing over the previous $\Delta t = 20$ years, $\overline{RF}$ the running average of RF and $\mathrm{AGWP}_H(\mathrm{CO_2})$ the AGWP of a CO 2 pulse at a time horizon of H years. The above equation differs from the one used by Lee et al. 1 only by the multiplication by g(s), which was introduced in the same year by Smith et al. 40 .

Table 1 | Overview of the performance of the analysed climate metrics with respect to each requirement

Table 2 | Calculation methods for all climate metrics used in this research
t 0 is the start of an emission series and, therefore, also the year for which a climate metric is to be calculated; H is the time horizon, r the efficacy and i a single species. $E_{\mathrm{CO_2\text{-}we}}$ is explained in the calculation description of the GWP*. Note that these methods all require the time series of radiative forcing (RF) or temperature change ΔT at least from t 0 until t 0 + H, which requires the use of a separate climate model. The GWP* and EGWP* require the time series of RF at least from t 0 − 20 until t 0 + H.

Table 3 | Ranges of fleet design parameters for the fleet pairing analysis simulations

Table 4 | Assumed change of in-flight emissions and emission-related effects for Sustainable Aviation Fuel (SAF) and hydrogen