Introduction

The world has warmed by about a degree since the beginning of the 20th century and attribution research points to anthropogenic drivers as the dominant cause1 with greenhouse gas emissions as the major contributor to the warming2. The role of human influence is also evident in regional warming since the 1950s3, which, in turn, engenders an intensification of extreme heat episodes4,5 that put a strain on human and, more widely, ecosystem health6,7. Concern about the changing characteristics of weather and climate extremes propelled the science of event attribution8,9. A profusion of event studies have provided strong evidence that hot extremes around the world are becoming more frequent and intense as a result of anthropogenic climate change10. The pioneering work of Stott et al. (2004) that set the foundations of this new research area11, was motivated by the deadly European summer heatwave of 2003 that claimed the lives of 35,000–70,000 people12. Since then, further studies have shown how anthropogenic forcings influence not only heatwaves, but a range of different types of extremes, as well as climate change impacts13. Continuing warming is expected to further exacerbate risks, exposing billions of people to climatic conditions never experienced over the last 6000 years14. Our study aims to offer new insights into the looming reality of temperatures that would not only break new records, but also, in the absence of effective adaptation, could usher in calamitous socio-economic consequences. Dangerous temperature thresholds in terms of local impacts are sometimes easy to identify. For example, after reaching a record temperature of 38.7 °C in Cambridge in July 2019, the possibility of exceeding 40 °C in the UK, which would have been unthinkable a generation ago, began to appear within reach. 
An attribution study indeed confirmed that the chances of such an event are increasing so fast because of human influence that it could become common by the end of the century15. Only two years after the study was published, a new UK record was set at 40.3 °C16. Recent studies have also affirmed that Europe needs to brace itself for summers hotter than 200317,18. Extreme heat poses greater challenges to communities that are more vulnerable because of their lower adaptive capacity. In Latin American cities, for example, it is estimated that nearly a million deaths during 2002–2015 are linked to temperature extremes19. Even in hot countries like India, where one might expect the population to be well adapted to high temperatures, extreme heat still takes its toll. For example, during a persistent hot spell in Gujarat in May 2010, excess mortality was found to have increased by 43%20.

The raft of adverse impacts associated with extreme heat makes it imperative to understand how human climatic influences change the characteristics of heat extremes. The strong links between extremely high temperatures, a major contributor to heat stress, and excess mortality have been well-documented21,22. Such impacts are exacerbated in cities because of the urban heat island effect6. The increasing frequency of extremes would also lead to a shift in diseases23. Moreover, high temperatures have been shown to be detrimental to mental health24 and associated with an increase in the risk of suicide25. Excessive heat stress in hot ambient conditions has also been shown to reduce productivity and put a strain on occupational health26. Nevertheless, a billion outdoor workers are estimated to be exposed to extreme heat, with a third of them suffering ill health as a result6. As the climate continues to change, hotter regions are projected to bear most of the burden of heat-related health impacts in coming decades27. While the UK public is concerned about crossings of the 40 °C threshold28, hotter countries face the challenge of overshooting the more perilous threshold of 50 °C more frequently, or even for the first time in coming years. Here we use exceedances of the 50 °C threshold to characterise extremely high and potentially life-threatening temperatures, which are projected to become more persistent and widespread over our reference region29, and which could affect up to half of its population by the end of the century30. It is estimated that under a global warming greater than 2 °C, temperatures above 50 °C will be recorded on all continents except Antarctica, with maxima over the Arabian Peninsula31. Even in Europe the prospect of such extremes has become more palpable after the highest temperature of 48.8 °C was recorded in Sicily on 11 August 202132. 
On the other hand, the development of countries in the Arabian Gulf has already been factoring in summer temperatures that often rise above 50 °C33. In the absence of any mitigation measures, model projections suggest that in the second half of this century heatwaves of unprecedented severity would begin to emerge in North Africa and the Middle East, characterised by exceedances of the 50 °C threshold30. In this study we take a step further and aim to quantify the changing likelihood of such threshold exceedances in selected locations in the Mediterranean area and the Middle East. In addition to 50 °C, we also consider exceedances of 45 °C, as a more suitable threshold for cooler locations, where exceeding 50 °C remains unlikely even in future decades. Our analysis distinguishes between the roles of human influence and internal climate variability and maps out in a quantitative manner the growing reality of severe heat, from the pre-industrial climate to the present-day and on to the end of the century. To estimate future probabilities, we adopt a medium emissions scenario that, unlike the business-as-usual pathway, recognises global adaptation and mitigation efforts and therefore serves as a better starting point34,35.

As an attribution study, our work focusses on the meteorological hazard component of extreme heat risk. In addition, the extent of associated impacts would also be determined by changes in the population, as well as its exposure to the hazard and its degree of vulnerability36,37,38. Here we employ a risk-based attribution framework39,40,41 to estimate the probability of extreme temperatures in 12 locations over 3 continents around the Mediterranean Basin and the Middle East. Although all locations are characterised by hot summers, they cover a range of temperature climatologies, with some already experiencing temperatures close to or above 50 °C and some not. Simulated data from experiments with and without the effect of human influence on the climate yield the probability estimates of extreme events. The modelled data come from an ensemble of 11 state-of-the-art models that participated in the phase 6 of the Coupled Model Intercomparison Project (CMIP6)42 and are evaluated against station observations. Using data from the latest CMIP6 models we assess here for the first time how the climatological risk of high-impact extreme temperatures in the region changes under the influence of anthropogenic forcing. While keeping the focus on attribution, we also provide a complementary, albeit broad perspective on health impacts for the 12 locations, examining how anthropogenic warming might lead to longer periods of excess thermal deaths in the coming years. For this purpose, we use the High Risk Days (HRD) index, developed as a simple temperature-based metric for impact attribution studies, which estimates the additional number of days relative to a reference climate when high temperatures are expected to lead to enhanced heat-related mortality43. Results from our analysis are presented in the next section. 
We first illustrate how the warmest day of the year (tx01) in the reference locations has been influenced by anthropogenic forcings and is projected to continue to change. As our attribution assessment is largely model-based, we also evaluate the models against observations to examine whether they can reliably represent historical temperatures. Changes in the likelihood of exceeding extreme thresholds due to human influence are presented next, followed by the assessment on health impacts using the HRD index. Finally, in the last section of the paper we summarise key results and discuss the significance and implications of our attribution study.

Results

Local changes in tx01: observations, reanalysis, and CMIP6 data

The sparse coverage of observation stations that provide the data necessary for model evaluation impedes attribution investigations of local extremes over large areas, meaning that most studies either consider larger spatial scales or only a specific location44. Here we select 12 locations in hotspot regions, where it seems reasonable to question whether maximum temperatures might rise above the thresholds of 45 or even 50 °C with an increasing frequency in a warming climate. We include locations in 12 different countries of three continents (hereafter referred to using the country ID code), for which station daily temperature data45 are available over several recent decades (Table 1). Maximum daily temperatures (Tmax) above 50 °C have already been recorded at two locations (QAT and SAU), with five more stations bordering on the threshold (Table 1). Our analysis employs the annual maximum daily temperature (tx01), i.e., a measure of the warmest day of the year, to examine year-to-year changes in local extremes. To understand how the local tx01 characteristics take shape within the context of larger spatial scales, we also employ data from the 20th Century Reanalysis (20CR)46,47 and examine historical changes in tx01 over a wider region and a much longer period compared to station observations. Reanalysis temperature records and trends over the period 1871–2010 are illustrated in Fig. 1. While only a few European regions see records above 45 °C, such extremes are clearly manifest in Northern Africa, the Arabian Peninsula and other parts of the Middle East. Our reference locations sample a range of tx01 maxima, and include sites with relatively moderate extremes (in TUR, ISR, and JOR), where the current likelihood of hitting 50 °C appears to be near-zero, given the much cooler local records. An interesting question we aim to answer is whether such unprecedented temperature exceedances may become plausible at these locations in coming years. 
Trends computed with 20CR data reveal significant long-term warming in tx01 over most of the wider region (Fig. 1b). The small cooling trends over parts of Eastern Europe and Western Asia are not statistically significant as confirmed with a Mann–Kendall test. The reanalysis suggests that tx01 is on the rise at all reference locations, with an estimated warming rate between 0.5 and 2.5 °C per century.
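The significance test applied to the tx01 trends can be illustrated with a minimal sketch. It assumes a plain Mann–Kendall test with the normal approximation and the standard tie correction, and, like the analysis described here, it ignores serial correlation. The function name `mann_kendall` and the synthetic series are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, z, two-sided p-value) of the Mann-Kendall trend test.

    Normal approximation with tie correction; serial correlation is ignored.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    # Variance of S, reduced for groups of tied values.
    var = n * (n - 1) * (2 * n + 5)
    for t in Counter(series).values():
        var -= t * (t - 1) * (2 * t + 5)
    var /= 18.0
    if var == 0 or s == 0:
        return s, 0.0, 1.0
    z = (s - 1) / math.sqrt(var) if s > 0 else (s + 1) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Synthetic tx01 series (hypothetical): slow warming plus year-to-year wiggles.
tx01 = [44.0 + 0.015 * year + 0.5 * math.sin(year) for year in range(140)]
s, z, p = mann_kendall(tx01)
print("significant at the 10% level:", p < 0.10)
```

A grid point would be marked as significant when the returned p-value falls below 0.10, the level used for the crosses in Fig. 1b.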

Table 1 Observation stations.
Fig. 1: Records and trends of the warmest day of the year in the 20CR reanalysis.

a Map of the highest maximum daily temperature during 1871–2010. Darker colours correspond to temperatures higher than 45 °C. b Trends in the warmest day of the year (tx01) during the reanalysis period. Crosses mark areas where trends are significantly different from zero (tested at the 10% level), as determined by a Mann–Kendall test. When applying the Mann–Kendall test it is assumed that serial correlation has a minimal effect on the tx01 timeseries and is therefore not accounted for. The dots on the figure panels show the location of the 12 stations considered in the study.

Local tx01 estimates are also derived from CMIP6 simulations (Methods) that represent the natural climate, without the influence of anthropogenic forcings (NAT), and the actual world with all forcings acting on the climate system (ALL). We use 50 NAT and 46 ALL simulations from 11 different models that start in year 1850. The ALL simulations are extended to year 2100 following the “middle-of-the-road” emissions scenario SSP2-4.548. Local tx01 values correspond to the nearest model grid-box, an approximation that is found to work well for variables with large spatial correlation scales like temperature, away from densely built-up areas where urban warming also needs to be accounted for49. We therefore select only observation stations away from large urban centres, either in smaller towns or at airport sites outside the city boundaries, in order to minimise the effect of urbanisation. It should be stressed that our findings only apply to the specific locations considered here, and extreme event probabilities may indeed be higher in nearby cities. The approximation of local temperatures with grid-box data is also assessed in this study on the basis of the evaluation tests described later. A simple correction of the mean model bias relative to station observations is also applied to the simulated data (Methods). Timeseries of tx01 constructed with the ALL and NAT simulations and station observations are illustrated in Fig. 2. Although natural variability would mask to some extent the anthropogenic fingerprint, a warming trend has taken off in recent years in the ALL climate and is projected to intensify in future decades. Exceedances of the 45 °C threshold are gradually becoming more common in cooler locations (e.g., TUR) and are already a certainty in hotter locations. In SAU, for example, average tx01 values are projected to reach 53 °C by the end of the century, rendering 50 °C a relatively low temperature. 
Despite regional variations in the likelihood of extremes, the CMIP6 ALL experiment simulates years with tx01 greater than 50 °C in all reference locations towards the end of the century, even those that are currently several degrees cooler and have not yet experienced such intense extremes.
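The extraction of local tx01 values from gridded model output can be sketched as follows. This is only an illustration under stated assumptions: the grid is taken to be a regular latitude-longitude grid described by two coordinate lists, and the helper names `nearest_grid_box` and `annual_maximum` are hypothetical rather than part of any CMIP6 tooling.

```python
import math


def nearest_grid_box(station_lat, station_lon, grid_lats, grid_lons):
    """Return (i, j) indices of the grid-box centre closest to the station."""

    def angular_distance(lat1, lon1, lat2, lon2):
        # Haversine formula; returns the central angle in radians.
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dlon = math.radians(lon2 - lon1)
        a = (math.sin((p2 - p1) / 2) ** 2
             + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
        return 2 * math.asin(min(1.0, math.sqrt(a)))

    candidates = ((i, j) for i in range(len(grid_lats))
                  for j in range(len(grid_lons)))
    return min(
        candidates,
        key=lambda ij: angular_distance(
            station_lat, station_lon, grid_lats[ij[0]], grid_lons[ij[1]]),
    )


def annual_maximum(daily_tmax_by_year):
    """tx01: the warmest daily Tmax of each year, given {year: [daily values]}."""
    return {year: max(values) for year, values in daily_tmax_by_year.items()}


# Hypothetical station and coarse grid, for illustration only.
i, j = nearest_grid_box(31.9, 35.2, [30.0, 32.0, 34.0], [34.0, 36.0, 38.0])
print(i, j)
print(annual_maximum({2000: [30.0, 41.5, 39.0], 2001: [42.0, 40.0]}))
```

In practice the daily Tmax series of the selected grid box would be read for every simulation and passed through `annual_maximum` to obtain one tx01 value per simulated year.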

Fig. 2: Observed and modelled tx01 timeseries at the 12 reference locations.

Timeseries constructed with observational data are shown in black. Modelled timeseries from the 46 ALL and 50 NAT simulations provided by the CMIP6 models are shown in red and blue respectively. The grey horizontal lines mark the 45 and 50 °C thresholds. Location names are shown at the title of each panel.

Model evaluation

In common with other multi-model studies, we evaluate the ensemble of the 11 models as a whole15 against the station observations. While more detailed studies of extremes might evaluate various aspects of the simulated climate, like, for example, dynamical drivers, or land-atmosphere interactions, attribution analyses are primarily concerned with the ability of the models to reliably represent the distribution of the reference variable from which event probabilities are derived. We therefore employ some popular tests that have been developed and tried over years in event attribution studies15,50 to examine whether main statistical characteristics of tx01 constructed with the multi-model ensemble are overall consistent with the observations. Comparing the modelled and observed variability, we find that the observed standard deviation (SD), linked to the scale of the distribution, is within the range of the models (Supplementary Fig. 1), with no model yielding systematically high or low SD values across the reference locations. Power spectra that assess variability over different timescales51 also show that the observations lie within the range of the models (Supplementary Fig. 2). We next construct quantile-quantile (Q-Q) plots for each model simulation (Fig. 3), which broadly indicate the level of consistency between models and observations in representing different parts of the distribution (main body and tails). Again, we find no systematic discrepancy, but, on the contrary, a good level of agreement that is also the case for the warm tail of the distribution (higher tx01 values), pertinent to the representation of the likelihood of hot extremes. Finally, for each location, we apply a Kolmogorov-Smirnov test to compare observational tx01 estimates with a sample constructed by pooling together tx01 values from the 46 ALL simulations over the observational period. 
In all cases we find that the model-based distribution is indistinguishable from the observed one (p-value greater than 0.1). Despite sampling limitations due to the short observational records, the evaluation assessments could still flag models that markedly depart from the rest of the ensemble range and, in our case, they provide reassurance that the multi-model ensemble is indeed fit for the purposes of an attribution analysis. It is also evident that local temperatures can be adequately represented by grid-box scale data with no indication of significant urban warming to render the modelled temperatures too cold.
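The final evaluation step, comparing observed tx01 values with a pooled sample from the ALL simulations, can be sketched with a two-sample Kolmogorov-Smirnov test. The implementation below is a dependency-free illustration that assumes a commonly used asymptotic approximation for the p-value with a small-sample adjustment; the function name `ks_two_sample` and the toy samples are hypothetical, and a statistics library would normally be used instead.

```python
import math


def ks_two_sample(a, b):
    """Return (D, approximate p-value) of the two-sample Kolmogorov-Smirnov test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk through both samples and track the largest gap between the
    # two empirical cumulative distribution functions.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    if d == 0.0:
        return 0.0, 1.0
    ne = math.sqrt(n * m / (n + m))
    lam = (ne + 0.12 + 0.11 / ne) * d
    if lam < 0.2:
        # The Kolmogorov tail probability is indistinguishable from 1 here.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Hypothetical data: observed tx01 and values pooled from several simulations
# over the same observational period.
observed = [44.1, 45.3, 43.8, 46.0, 44.9, 45.5, 44.4, 45.1]
simulations = [[44.0, 45.0, 46.1, 44.7], [45.2, 43.9, 44.8, 45.6]]
pooled = [value for run in simulations for value in run]
d, p = ks_two_sample(observed, pooled)
print("distributions indistinguishable at the 10% level:", p > 0.1)
```

Under the criterion used in this section, the modelled distribution passes the test when the p-value is greater than 0.1.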

Fig. 3: Evaluation of the modelled tx01 distribution.

Quantile–quantile (Q–Q) plots for each of the ALL simulations, comparing the modelled and the observed tx01 distributions. Location names are shown at the title of each panel.

The probability of exceeding 45 and 50 °C

We next use the multi-model ALL and NAT ensembles to estimate probabilities of extreme events, defined as tx01 exceedances of a high temperature threshold. Our main interest is in exceedances of 50 °C, but we also consider the 45 °C threshold that is more relevant for hot extremes in cooler locations, at least without strong anthropogenic influence. The natural world probabilities are estimated from all simulated years of the NAT experiment, while present-day and future probabilities are derived from time slices of the ALL simulations. Details of our analysis are given in the Methods. As common in event attribution studies, we report return times of extreme events (inverse probabilities), measuring the average span between consecutive extremes under certain climatic conditions. Figure 4 illustrates the change in the return time of years with tx01 above 45 °C (also Supplementary Table 2). In most locations the threshold is not particularly extreme and is exceeded every year, or every few years, even in the natural world. In the three cooler locations (ISR, JOR, TUR), however, such events are extremely rare in the NAT climate (return time of the order of centuries), but are becoming more common due to human influence and are estimated to occur at least once a decade by the end of the century. The changing return times for the 50 °C threshold are shown in Fig. 5 (also Supplementary Table 2). We find that reaching such extreme temperatures would be extremely rare or even impossible in the NAT climate. Even in the warmest SAU location, the estimated NAT return time is of the order of a century. Nevertheless, the likelihood is rapidly increasing due to anthropogenic forcings, with sub-century return times in today’s climate computed at five of the reference locations. 
By the end of the century, a tx01 greater than 50 °C may be seen at least once a decade in most locations and could be exceeded annually at the warmest sites of ARE, QAT and SAU, where warm extremes are projected to reach unprecedented highs in the mid- or high-50s (Fig. 2). Interestingly, even at the relatively cooler station in ESP, where exceeding 50 °C never occurs in the NAT climate and is extremely rare up to the 2050s, the likelihood increases at a fast pace, leading to return times of less than a century by 2100 (best estimate: 70 years). Similarly, such extremes that would not occur in the pre-industrial climate at LBY, could become common in coming decades. Although the likelihood remains extremely low at the three cooler locations (ISR, JOR, TUR), there are years towards the end of the century with some simulated temperatures above 50 °C, implying that, although extremely rare, these exceedances are not impossible. A summary plot of the temporal changes due to human influence is shown in Fig. 6, where extremes are categorised according to their likelihood into four groups (from nearly impossible to common). In contrast to the lower threshold, where less change is seen, exceedances of 50 °C at most locations swiftly transit through categories of increasing likelihood. Finally, risk ratio estimates are reported in Table 2, measuring the present and future increase in the likelihood of extreme events relative to the NAT climate. For the more moderate threshold of 45 °C, we compute lower risk ratios that may even reduce to unity at locations where this temperature is commonly exceeded even without anthropogenic warming. At the three cooler locations, human influence is estimated to have increased the likelihood by a factor of about ten, but risk ratios become much higher by the end of the century. 
For the 50 °C threshold we find that the risk ratio goes to infinity at sites with a near-zero likelihood of extreme events in the NAT climate, and estimate tens or hundreds of times higher probabilities at hotter locations, again with further increases in the future. Risk ratios of infinity are often reported in attribution studies10, suggesting that events that would have been effectively impossible without human emissions have become possible. The range of risk ratios reflects differences in the local climate. For example, temperatures above 50 °C without human influence are more common in SAU than ARE (Fig. 2), hence the effect of anthropogenic warming is more prominent in ARE (higher risk ratio).
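The relation between exceedance probabilities, return times and risk ratios used in this section can be made explicit with a short sketch. The function names and the probabilities in the usage lines are hypothetical and are not results of the study.

```python
import math


def return_time(probability):
    """Return time in years: the inverse of the annual exceedance probability."""
    return math.inf if probability == 0 else 1.0 / probability


def risk_ratio(p_all, p_nat):
    """Ratio of the exceedance probability with (ALL) and without (NAT) human influence."""
    if p_nat == 0:
        # The event never occurs in the natural climate: the ratio is infinite
        # if it can occur with human influence, and undefined otherwise.
        return math.inf if p_all > 0 else math.nan
    return p_all / p_nat


# Hypothetical probabilities, for illustration only.
print(return_time(0.01))        # a 1% annual chance
print(risk_ratio(0.10, 0.01))   # ten times more likely with human influence
print(risk_ratio(0.02, 0.0))    # impossible in NAT, possible in ALL
print(risk_ratio(1.0, 1.0))     # threshold exceeded every year in both climates
```

The last two calls correspond to the limiting cases described above: a risk ratio of infinity where the NAT likelihood is zero, and a risk ratio of unity where the threshold is exceeded every year even without anthropogenic warming.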

Fig. 4: Change in the probability of exceeding 45 °C.

Return time (inverse probability) estimates are shown for the NAT climate (green), the present-day (pink), the middle of the century (red) and the end of the century (dark red). The black line connects the best estimates (50th percentile) corresponding to the different climatic conditions and the coloured vertical bars mark the 5–95% uncertainty range. Location names are shown at the title of each panel.

Fig. 5: Change in the probability of exceeding 50 °C.

Return time (inverse probability) estimates are shown for the NAT climate (green), the present-day (pink), the middle of the century (red) and the end of the century (dark red). The black line connects the best estimates (50th percentile) corresponding to the different climatic conditions and the coloured vertical bars mark the 5–95% uncertainty range. The grey horizontal lines mark the 1000 years (solid), 100 years (dashed), and 10 years (dotted) return time values. Location names are shown at the title of each panel.

Fig. 6: Temporal changes in the likelihood of extremes under the influence of anthropogenic forcings.

The likelihood (best estimate) of exceeding 45 °C (top) and 50 °C (bottom) in the natural climate, the present-day, the middle of the century, and the end of the century is grouped into four categories based on the return time. The horizontal rows correspond to the reference locations and each likelihood category is represented by a different colour.

Table 2 Risk ratio.

Increase in high risk days

As the health data required for a heat-mortality attribution analysis are not available for the countries considered here, we carry out a more general assessment of the expected increase in the severity and frequency of days with excess thermal deaths. In physiological terms, the “optimum temperature” (OT), or Minimum Mortality Temperature (MMT), describes the local comfort zone and sets the mortality baseline22,52. Excess deaths that occur when the temperature rises above or falls below the OT are referred to as heat- and cold-related mortality, respectively. Non-optimum temperatures are estimated to account for almost 8% of all-cause mortality in selected locations22. Analysis of mortality data in Japan and in 79 cities in three different continents with different climates has demonstrated that the OT can be approximated by the 84th percentile of daily Tmax52,53. Even without explicitly accounting for the effect of humidity, no significant discrepancy was found between the OT and the 84th percentile in the cities examined. As the climate warms, the 84th percentile of Tmax also increases, and Fig. 7 illustrates the difference between its present-day and natural world values at the 12 locations considered in this study (Methods). This difference constitutes the “high risk warming” (HRW) index43 that measures the anthropogenic warming leading to an increase in thermal deaths, or the extra amount of warming that people need to adapt to relative to the NAT climate to counteract an increase in heat-related mortality. We find that the HRW index ranges between 1.6 and 2.3 °C at the reference locations. Moreover, the 84th percentile is much lower than 50 °C (and even below 40 °C at most locations), suggesting that exceedances of such a high threshold could bring about large mortality spikes as they drive the population well outside its comfort zone, in the absence of mitigating social factors. 
We next estimate the local “high risk days” (HRD) index that measures the change in the number of days above the 84th percentile of Tmax relative to a baseline climate43 (Methods). The HRD index is essentially a crude estimate of the additional days when thermal deaths are expected to occur because of anthropogenic warming. Here we set the baseline climate to present-day to examine future increases in risk relative to today’s climate. Alternatively, the baseline could be set to the NAT climate, or any period relative to which one wants to assess the anthropogenic increase in the frequency of excess mortality43. Figure 8 shows timeseries of the HRD index from present-day to the end of the century. Our analysis suggests an index increase of about 1–2 months (best estimate) at all regions. This estimate suggests that the reference locations are set to experience several more weeks of oppressive heat in coming decades, though the actual number of excess deaths would also depend on other meteorological factors like humidity, as well as socio-economic aspects related to how well communities are prepared to adapt to new climatic conditions where extreme heat episodes become more frequent.
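The two temperature-based indices can be sketched as follows. This is one plausible reading of the descriptions given in this section, not the reference implementation: it assumes that the HRW index is the difference between the 84th percentiles of daily Tmax in two climates, and that the HRD index is the change in the mean annual number of days above the baseline 84th percentile. The percentile interpolation rule, the function names and the toy data are all assumptions.

```python
def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def high_risk_warming(tmax_present, tmax_nat, q=84.0):
    """HRW: warming of the q-th percentile of daily Tmax relative to NAT."""
    return percentile(tmax_present, q) - percentile(tmax_nat, q)


def high_risk_days(tmax_future, n_years_future,
                   tmax_baseline, n_years_baseline, q=84.0):
    """HRD: extra days per year above the baseline q-th percentile of Tmax."""
    threshold = percentile(tmax_baseline, q)
    future = sum(t > threshold for t in tmax_future) / n_years_future
    baseline = sum(t > threshold for t in tmax_baseline) / n_years_baseline
    return future - baseline


# Toy example: one "year" of 100 daily values, uniformly warmed by 2 degrees.
baseline = [float(v) for v in range(100)]
future = [v + 2.0 for v in baseline]
print(high_risk_warming(future, baseline))
print(high_risk_days(future, 1, baseline, 1))
```

With real data, the daily Tmax values of all simulations in a time window would be pooled, and the number of simulated years in each pool would be passed as `n_years_future` and `n_years_baseline`.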

Fig. 7: High-risk warming.

Estimates of the 84th percentile of Tmax at the 12 reference locations estimated with data from the CMIP6 experiments for the NAT climate (green) and the present-day climate (red). The increase in the 84th percentile that represents the warming of the pre-industrial OT temperature due to human influence is also marked on the plot for each location.

Fig. 8: Increase in high-risk days during the course of the century.

Timeseries of the HRD index representing the additional days in a year with excess heat-related mortality relative to the present-day because of anthropogenic warming. The thick red line corresponds to the ensemble mean of the ALL simulations and the pink shaded area marks the 5–95% uncertainty range. The ensemble mean value at the end of the century is also marked on each panel. Location names are shown at the title of each panel.

Discussion

Understanding the changing characteristics of extreme events under the influence of anthropogenic climate change provides a solid scientific basis for the planning of effective adaptation and mitigation policies. Changes in the climate of the reference region54 with increases in extremely high temperatures over coming decades55 have been highlighted before in the literature. Our attribution analysis demonstrates how the frequency of maximum temperatures above 50 °C is rapidly increasing in parts of the Mediterranean and the Middle East. Such extreme temperatures could be considered commonplace by the end of the century at locations where they are still very rare and would not have occurred in the pre-industrial world. As the hottest European temperatures already verge on 50 °C, the question arises whether extreme climatic conditions similar to neighbouring regions like Northern Africa are migrating into Europe. Our work provides some evidence that this may indeed be the case, at least in areas with the largest warming trends like Southern Spain. Although this northward climate migration is perhaps an unsurprising scientific inference, its associated unprecedented impacts are of great importance to affected European countries. Even at the hottest locations examined here, where temperatures above 50 °C have already been observed, human influence is found to have increased their likelihood by a factor of ten or more. Of course, like all scientific studies, our analysis has its limitations. Our multi-model ensemble is tied to the available simulations from models that make unequal contributions to it. However, such an “ensemble of opportunity”, as often referred to, although not perfectly constructed, can still provide an indicative measure of the modelling uncertainty. Results are also very much dependent on the experimental design, e.g., the choice of the future emissions pathway, or the selected locations. 
The likelihood of extremes is indeed expected to vary considerably within a country due to regional climatic variations, urbanisation, etc., and the selected sites offer only a limited, yet instructive, perspective on the overall regional response to climate change. Our study considers only one specific region where local exceedances of 50 °C are shown to occur with an increasing frequency, but, given sufficient data availability, future work may extend it to other hotspot regions like South Asia, Central America, or Australia. On the side of impacts, we find a marked increase in days with excess mortality due to anthropogenic warming, though more focussed follow-up studies are necessary to establish how the combination of meteorological variables would influence lethal heat56, or assess the important roles of exposure, population growth and adaptive capacity6.

The increasing risk of population exposure to unprecedented temperatures has profound implications. It has already been pointed out that parts of the Persian Gulf and the wider Middle East are on the brink of experiencing heat stress too severe for human tolerance57,58, calling for urgent action to ensure high-risk areas remain inhabitable. There are also concerns that in Gulf countries with a booming building industry, construction workers are particularly vulnerable to extreme heat59,60, but associated heat illnesses and deaths may be underreported (https://www.bbc.co.uk/news/av/world-middle-east-61711468). Preventive measures in these countries ban outdoor work at certain hours during the hottest months, but still cannot guarantee workers will not be exposed to dangerously high temperatures. It is therefore essential that scientific information is not only communicated to stakeholders, but also integrated into the process of decision-making. Adaptation and public health planning, which has so far been largely reactive61, may call for changes in the built environment, efforts to raise public awareness, or improvements in healthcare readiness. Such steps become urgent in light of the ominous intensification of deadly heat and need to be taken in a way that would not widen the disparity between the poor and the rich. For example, air-conditioning is an effective resilience measure against heat, mitigating up to 20% of heat-related mortality. While it is ubiquitous in wealthier countries like Qatar and has now begun to feature in outdoor spaces (https://www.washingtonpost.com/graphics/2019/world/climate-environment/climate-change-qatar-air-conditioning-outdoors/), it may still be considered a luxury among poorer countries, or the underprivileged, who are likely to bear the brunt of climate change impacts. Finally, alongside socio-economic adaptation, longer term mitigation action is also necessary to ensure communities will not be pushed beyond survival limits.

Methods

Computing tx01 with CMIP6 data

We use daily Tmax data from attribution experiments62 with a multi-model ensemble of 11 CMIP6 models to compute tx01 values for every simulated year. The tx01 is a popular index in climate change studies, describing annual events with more robust statistical properties than alternative extreme indicators of rarer events that lie in the far tail of the temperature distribution and are thus more poorly sampled. Details on the CMIP6 models can be found in the latest report of the Intergovernmental Panel on Climate Change (IPCC) by Working Group I63 and references therein. CMIP6 models have been used to investigate temperature changes in the reference region64,65,66 and shown to perform better than the previous generation of climate models67. Following a risk-based attribution framework, we compare simulations with (ALL) and without (NAT) human influence on the climate. The NAT experiment includes natural forcings only (changes in volcanic aerosols and in the solar irradiance), whereas the ALL experiment also includes the effect of anthropogenic forcings (changes in greenhouse gases, aerosols, ozone and land use). The NAT simulations span the period 1850–2020, while the ALL simulations, which also start in 1850, are extended to year 2100 with the “middle of the road” Shared Socioeconomic Pathway 2-4.5 (SSP2-4.5)48. We select models that provide both ALL and NAT simulations, and, for each experiment, we limit the number of simulations to ten per model to minimise biases towards models with larger ensembles. We also exclude two models that fail the evaluation tests against observations and have considerable data gaps. Given all the selection requirements, the ALL and NAT ensembles used in this study include 46 and 50 simulations respectively (Supplementary Table 1). 
Two main advantages of using multi-model ensembles are that a) they provide large samples for the estimation of small probabilities associated with extreme events, which reduces the statistical uncertainty, and b) they account for model uncertainties, as opposed to single-model analyses. Values of tx01 at the location of each observation station (Table 1) are computed for all simulated years using the daily Tmax data at the grid-point nearest to the station. Local temperatures may indeed be well approximated by modelled grid-box scale temperatures, as long as they are not heavily influenced by urban warming. For example, data from a CMIP6 multi-model ensemble similar to the one used in this study have been shown to represent well temperatures at the station of Kameoka in the suburbs of Kyoto, but require a correction to account for the urban warming that underlies recorded temperatures at Kyoto’s city centre49. Grid-point data have also served as a local proxy for different types of meteorological variables68. Finally, as in previous work15, a simple bias-correction was applied to the modelled estimates of tx01 to ensure they have the same mean over the observational period as the station data. The use of mean bias-correction, or, alternatively, of anomalies relative to a baseline period, is common in attribution studies and necessary to compare or combine different datasets. A bias estimate is derived for each model by subtracting the mean observed from the mean modelled tx01 over the observational period and is subsequently removed from all the tx01 values (both ALL and NAT) simulated by the same model.
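The mean bias-correction described above can be illustrated with a minimal sketch for a single model. It rests on several assumptions that are not stated in this section: tx01 is taken to be the annual maximum of daily Tmax, the modelled mean over the observational period is taken from the ALL simulations, and the function names, array shapes and synthetic data are purely illustrative.

```python
import numpy as np

def annual_tx01(tmax_daily):
    """Reduce daily Tmax (..., n_years, n_days) to one tx01 value per year.

    ASSUMPTION: tx01 is treated here as the annual maximum of daily Tmax.
    """
    return np.asarray(tmax_daily).max(axis=-1)

def mean_bias_correct(tx01_all, tx01_nat, years_all, obs_tx01, obs_years):
    """Remove one model's mean tx01 bias from its ALL and NAT simulations.

    tx01_all, tx01_nat: arrays of shape (n_simulations, n_years).
    ASSUMPTION: the modelled mean is taken from the ALL simulations
    over the years covered by the station record.
    """
    in_obs = np.isin(years_all, obs_years)
    # Bias = mean modelled tx01 minus mean observed tx01 over the same years
    bias = tx01_all[:, in_obs].mean() - np.mean(obs_tx01)
    # The same bias is removed from every ALL and NAT value of this model
    return tx01_all - bias, tx01_nat - bias, bias

# Synthetic illustration only (hypothetical model with three simulations)
rng = np.random.default_rng(0)
years_all = np.arange(1850, 2101)   # ALL: 1850-2100
years_nat = np.arange(1850, 2021)   # NAT: 1850-2020
tx01_all = annual_tx01(rng.normal(38.0, 3.0, size=(3, years_all.size, 365)))
tx01_nat = annual_tx01(rng.normal(37.0, 3.0, size=(3, years_nat.size, 365)))
obs_years = np.arange(1980, 2021)   # hypothetical station record
obs_tx01 = rng.normal(44.0, 1.0, size=obs_years.size)

all_bc, nat_bc, bias = mean_bias_correct(
    tx01_all, tx01_nat, years_all, obs_tx01, obs_years
)
```

After the correction, the ALL simulations of the model have the same mean tx01 as the station over the observational period, while the difference between ALL and NAT is left unchanged.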

Probability calculation

The likelihood of extreme events, defined as a tx01 exceedance of 45 or 50 °C, is estimated in the NAT and the present-day climate, as well as the future climate of the mid- and late 21st century under SSP2 4.5. The NAT climate is largely stationary, and we therefore use a sample of all tx01 values simulated by the NAT experiment, namely a total of 8550 values (50 simulations × 171 years in the period 1850–2020). The large sample size has the advantage of reducing the sampling uncertainty in estimates of very small extreme probabilities. To account for non-stationarities in the warming ALL climate, we extract tx01 data from time-windows that represent the climate at different points in time15. More specifically, we utilise tx01 data in 30-year-long periods centred on 2022 and 2050 for the present-day and mid-century climate respectively, and in the last 30 years of the extended ALL simulations for the late century climate. We thus obtain samples of 1380 values in total (46 simulations × 30 years) from which we estimate the likelihood of extreme events. Although the ALL samples are smaller than in the case of NAT, the associated increase in the sampling uncertainty is, at least partly, compensated by the fact that event probabilities become higher (and so less uncertain) in a warming world. We compute extreme event probabilities by fitting a Generalised Extreme Value (GEV) distribution to the simulated data9, a parametric fit commonly applied to estimate probabilities of block maxima extremes10. The fit is applied to the 8550 tx01 values of the NAT and 1380 values of the ALL experiment to quantify the NAT and ALL probabilities. The uncertainty in the probabilities comes from a Monte Carlo bootstrap procedure15 that resamples the tx01 data 1000 times and derives new probability, or return time (inverse probability), estimates from the resampled data.
In our study we report the best estimate of the return time, corresponding to the 50th percentile, as well as the 5–95% uncertainty range. The 50th, 5th and 95th percentiles defining the best estimate and the associated uncertainty range are thus computed from samples of 1000 possible values of the return time from the bootstrap procedure. The risk ratio, which quantifies the change in the likelihood of extreme events because of human influence, is computed simply by dividing the probability of extreme events in the ALL climate by the probability in the NAT climate. As we have multiple estimates of the ALL and NAT probabilities, we get multiple estimates of the risk ratio for the uncertainty calculation.
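The probability calculation can be sketched as follows, using `scipy.stats.genextreme` for the GEV fit. This is an illustration under stated assumptions rather than the code used in the study: the function names are hypothetical, the input samples are synthetic, the bootstrap resamples individual tx01 values with replacement, and the risk ratio pairs the b-th ALL resample with the b-th NAT resample. Note that SciPy's shape parameter `c` has the opposite sign to the shape parameter ξ often used in the climate literature.

```python
import numpy as np
from scipy.stats import genextreme

def exceedance_prob(sample, threshold):
    """Fit a GEV to the sample and return P(tx01 > threshold)."""
    c, loc, scale = genextreme.fit(sample)
    return genextreme.sf(threshold, c, loc=loc, scale=scale)

def bootstrap_attribution(tx01_all, tx01_nat, threshold, n_boot=1000, seed=0):
    """Bootstrap return times and risk ratios; report 5th, 50th, 95th percentiles.

    ASSUMPTION: values are resampled individually with replacement, and the
    b-th ALL estimate is paired with the b-th NAT estimate for the risk ratio.
    """
    rng = np.random.default_rng(seed)
    p_all = np.empty(n_boot)
    p_nat = np.empty(n_boot)
    for b in range(n_boot):
        p_all[b] = exceedance_prob(
            rng.choice(tx01_all, size=tx01_all.size, replace=True), threshold)
        p_nat[b] = exceedance_prob(
            rng.choice(tx01_nat, size=tx01_nat.size, replace=True), threshold)
    # A fitted probability of zero gives an infinite return time or an
    # undefined ratio; this is left visible rather than silently masked.
    with np.errstate(divide="ignore", invalid="ignore"):
        return_time_all = 1.0 / p_all
        return_time_nat = 1.0 / p_nat
        risk_ratio = p_all / p_nat

    def summarise(x):
        return np.percentile(x, [5, 50, 95])

    return {
        "return_time_all": summarise(return_time_all),
        "return_time_nat": summarise(return_time_nat),
        "risk_ratio": summarise(risk_ratio),
    }

# Synthetic illustration only: small samples and few resamples to keep it
# quick. The study uses 1380 (ALL) and 8550 (NAT) values and 1000 resamples.
rng = np.random.default_rng(1)
tx01_all = genextreme.rvs(0.2, loc=45.0, scale=1.5, size=400, random_state=rng)
tx01_nat = genextreme.rvs(0.2, loc=43.0, scale=1.5, size=400, random_state=rng)
result = bootstrap_attribution(tx01_all, tx01_nat, threshold=45.0, n_boot=20)
```

Each entry of `result` holds the lower bound, best estimate and upper bound, in that order. Return times are expressed in years because tx01 provides one value per year.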

Heat-related mortality indicators

The change in the 84th percentile of Tmax relative to the NAT climate, defined as the HRW index43, is indicative of the additional level of warming that people need to adapt to, so that today’s warmer 84th percentile becomes the new OT. It should be noted that the HRW is not a direct measure of the additional deaths in the warmer climate, although higher HRW values are indicative of higher heat-related mortality. We compute the 84th percentile of the natural climate at each location from the Tmax data provided by the NAT simulations. For today’s climate, the percentile is calculated from the Tmax data provided by the ALL simulations in the time-window 2008–2037. In order to estimate the actual temperature that the 84th percentile corresponds to, the mean bias is removed for each model, using the same correction method applied to tx01.
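As a minimal sketch of the HRW calculation, the index can be written as the difference between two 84th percentiles of daily Tmax. The function name, the pooling of all daily values into a single sample and the synthetic data are illustrative assumptions. The optional `bias` argument stands for the mean model bias: removing it changes the two percentile temperatures but cancels in the difference.

```python
import numpy as np

def hrw_index(tmax_nat, tmax_all_present, bias=0.0):
    """Return (HRW, present-day 84th percentile, NAT 84th percentile).

    tmax_nat: daily Tmax from the NAT simulations at one location.
    tmax_all_present: daily Tmax from the ALL simulations, 2008-2037.
    bias: mean model bias (hypothetical value), removed from both percentiles.
    """
    p84_nat = np.percentile(tmax_nat, 84) - bias
    p84_all = np.percentile(tmax_all_present, 84) - bias
    return p84_all - p84_nat, p84_all, p84_nat

# Synthetic illustration only: the present-day climate is 1.5 degC warmer
rng = np.random.default_rng(2)
tmax_nat = rng.normal(35.0, 5.0, size=171 * 365)   # 1850-2020
tmax_all = rng.normal(36.5, 5.0, size=30 * 365)    # 2008-2037
hrw, p84_all, p84_nat = hrw_index(tmax_nat, tmax_all, bias=0.8)
```

In this synthetic case the HRW recovers the imposed shift of about 1.5 °C, within sampling noise.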

The changing frequency of days when heat-related mortality is expected to occur is measured with the HRD index43. If $N^{i}$ denotes the number of hot days in year i with Tmax above the 84th percentile of a baseline period, and $N^{base}$ the average number of hot days in baseline years, then

$${\rm{HRD}}={N}^{i}-{N}^{{base}}.$$
(1)

By definition, an average baseline year should have about 58 days (0.16 × 365) with Tmax higher than the 84th percentile (a proxy of the OT). The HRD index is a basic measure of the additional number of days that people need to adapt to because of the human-induced warming since the baseline period. Here we set the baseline to the present-day climate represented by years 2008–2037, for which we have already calculated the 84th percentile, as explained earlier. Previous work showed that different baselines can also be employed to represent different levels of adaptation, although the change of baseline does not markedly change future HRD estimates43. To calculate the HRD in future years and construct the timeseries in Fig. 8, we use simulated Tmax data in 30-year rolling windows and compute $N^{i}$ for the central year i as the average annual number of days with Tmax above the baseline (present-day) OT. We thus construct timeseries for each of the ALL simulations and plot the ensemble mean and the 5–95% uncertainty range derived from the ensemble spread. Note that as the HRD comes from the difference between two metrics that have the same mean bias, there is no need to apply a bias-correction in this case.
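The rolling-window HRD calculation for a single simulation can be sketched as below. The function name, the 365-day calendar, the synthetic warming trend and the labelling of each 30-year window by its 15th year (so that 2008–2037 is centred on 2022) are assumptions made for illustration.

```python
import numpy as np

def hrd_timeseries(tmax, years, base_start=2008, base_end=2037, window=30):
    """HRD for the central year of each rolling window of one simulation.

    tmax: daily Tmax with shape (n_years, n_days); years: shape (n_years,).
    Returns (central_years, hrd, n_base).
    """
    in_base = (years >= base_start) & (years <= base_end)
    # Baseline OT proxy: 84th percentile of all baseline daily Tmax values
    ot = np.percentile(tmax[in_base], 84)
    hot_days = (tmax > ot).sum(axis=1)      # hot days in each year
    n_base = hot_days[in_base].mean()       # about 0.16 x 365
    centres, hrd = [], []
    for start in range(len(years) - window + 1):
        # Average annual number of hot days in the window
        n_i = hot_days[start:start + window].mean()
        # ASSUMPTION: the window is labelled by its 15th year
        centres.append(years[start + window // 2 - 1])
        hrd.append(n_i - n_base)
    return np.array(centres), np.array(hrd), n_base

# Synthetic illustration only: seasonal cycle, linear warming and noise
rng = np.random.default_rng(3)
years = np.arange(2000, 2101)
days = np.arange(365)
seasonal = 10.0 * np.sin(2.0 * np.pi * (days - 100) / 365.0)
trend = 0.04 * (years - years[0])
tmax = 30.0 + seasonal[None, :] + trend[:, None] + rng.normal(
    0.0, 2.0, size=(years.size, days.size))
centres, hrd, n_base = hrd_timeseries(tmax, years)
```

By construction the HRD is zero for the window that coincides with the baseline period, and it grows in later windows as the simulated climate warms. Repeating the calculation for every ALL simulation would give the ensemble from which a mean and a 5–95% range can be derived.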