Machine learning–based observation-constrained projections reveal elevated global socioeconomic risks from wildfire

Reliable projections of wildfire and associated socioeconomic risks are crucial for the development of efficient and effective adaptation and mitigation strategies. The lack of or limited observational constraints for modeling outputs impairs the credibility of wildfire projections. Here, we present a machine learning framework to constrain the future fire carbon emissions simulated by 13 Earth system models from the Coupled Model Intercomparison Project phase 6 (CMIP6), using historical, observed joint states of fire-relevant variables. During the twenty-first century, the observation-constrained ensemble indicates a weaker increase in global fire carbon emissions but higher increase in global wildfire exposure in population, gross domestic production, and agricultural area, compared with the default ensemble. Such elevated socioeconomic risks are primarily caused by the compound regional enhancement of future wildfire activity and socioeconomic development in the western and central African countries, necessitating an emergent strategic preparedness to wildfires in these countries.

The work "Machine learning-based observation-constrained projections reveal elevated global socioeconomic risks to wildfire in the twenty-first century" presents a very promising method to forecast wildfire impacts and their associated risks. The authors propose a new methodological approach combining ESM and Machine Learning to introduce constraints on fire-related drivers in the modeling workflow. The results suggest a clear improvement when compared with "original" methods both in terms of overall magnitude of CO2 emission and its spatial pattern. The procedure includes not only a validation that sets a common ground for comparison but also several means to illustrate the main differences and improvements. Overall, the conclusions are well aligned with the results, though additional clarifications and modifications are required. The methodology couples together ESMs and MLT in an effort to overcome some limitations of the current procedures. The main advantage is that MLT can embed non-linear feedbacks from wildfire drivers. The concept looks promising. The description of the methods is quite complete though further details need to be provided in order for the method to be fully reproduced. The ESM models and input data are well described. Likewise, the MLT and associated packages are also presented together with a summary of the overall workflow for validations and performance evaluation. In spite of the promising results that will surely contribute to the current state-of-the-art, there are some limitations that I believe require further investigation.
1) In lines 196-197 it is stated that "four ESMs… …simulate a reduced global fire carbon emission during the first half of the twenty-first century". This seems rather unrealistic and partially responsible for the relatively low increase in emissions towards the end of the century (2.5% according to the proposed method compared to the 6% in the original one). The issue is clearly related to the 2 NorESM models, pulling down fire-carbon emissions during 2040-2050. The predicted drop is equal in magnitude to the increase from 2050 to 2100 while setting emissions below 2010's estimates. What would be the emission estimates disregarding these ESMs? Moreover, the sudden decay between 2040 and 2050 seems to be what drives risk trajectories (Figure 3a and 3g), which would otherwise be more similar.
2) Another major concern has to do with the use of MLT. More specifically, the limited capability of MLT (especially Random Forest) to extrapolate outside the range of observations may be hindering the projections.
3) In the same line, how does the model respond to a more conservative scenario? What's the difference in magnitude compared to an SSP2-4.5, for instance? My main concern here has to do with the sensitivity of the proposed method to properly address different scenarios. Please, can you elaborate on this?
4) Three MLT with very different parameterization were used. While random forest is relatively easy to tune, support vector machines are quite challenging to optimize. Further insights into model calibration are required. Since caret is involved I assume parameter grids were fed to optimize parameter selection. I understand this is not a "manual" on how to calibrate ML models but the very basics must be provided. E.g., number of trees for RF, kernel types in SVM, learning rates. Were these parameters set constant over time?
Moreover, how were the MLT combined? I mean, how were predictions from the three models integrated? Is there any kind of "model averaging" involved?
What are these importance metrics used in each MLT? I am not sure the comparison between different metrics is reliable even in relative terms. There are examples of performance assessment through jackknife estimators using a common performance metric from test samples, fitting univariate models or models without a candidate predictor to assess the sensitivity of the model to that predictor. See for instance:
- In my opinion it is critical that an expert on EC be included among the reviewers. I am not an expert on emergent constraints, but it is not clear to me that the authors have avoided the pitfalls associated with this approach. Some of these are outlined in , Williamson et al. 2021.
o Strength of the statistical relationship. As far as I can tell Supp Fig 2 is the only place that the actual performance of the model is displayed. While it is a significant improvement on the 'default' EC option there are still large areas of fire-prone land with no significant correlation. This includes quite large parts of Australia, South America, Africa, North America and Europe, many of which are discussed at length in the manuscript. Can the authors clarify the amount of flammable land/burned area/emissions that their model is reliable for? Is it valid to apply the model in areas with no significant relationship? And does it matter that the model reproduces mean global emissions accurately in spite of these large areas without correlation? Could errors be cancelling each other out? Given the importance of this step, it is not clear to me that the authors have established the relationship significantly. They could perhaps provide some information to evaluate the machine learning approaches they have taken, either individually or in aggregate. No doubt an expert on EC could shed light on many of these issues.
o Insufficient sample size - is the use of individual pixels enough to overcome this?
o Model independence - multiple models share the same land use and fire parametrisations
o Overlooking the potential for tipping points - these are widely recognised to be a risk for some fire regimes
o The mechanisms underpinning the relationship. I applaud the authors for including multiple relevant variables, but the use of machine learning largely obscures any mechanisms at play (the variable importance plots in the Supp Info notwithstanding). The authors describe this as process-based but I'm not sure this qualifies, as there is no explicit process. The authors do not provide any information on how they selected those variables (L518-520), or whether a different set might have performed better. Given the centrality of fire to this study, it would be good to see this addressed.
o Quality of observations. It is worth acknowledging limitations in this data. Wind speed in particular suffers from a lack of long term high quality observational datasets against which to judge the accuracy of reanalysis values. Relative humidity, flash rate and soil moisture are probably also a fair way behind variables like temperature and rainfall in terms of their accurate simulation. See L409
o Can you clarify that you have avoided the possibility of strong but "overconfident" constraints?
o Can you give an indication of reasonable performance of EC and its basis? Presumably the process by definition improves accuracy, so what is the best way to judge its effectiveness?
- There are a number of simple errors in language and referencing which can easily be addressed, but somewhat undermine confidence in the rest of the manuscript e.g.
o It is risks from / posed by wildfire, not risks to wildfire
o Some references are listed twice
o L91 bracket in wrong spot
o Abstract: Forkel et al. and Li et al. are good references but do not go to the claim that it is a lack of observational constraints that limit wildfire projection credibility.
It is arguably a lack of process-based understanding that limits confidence in ESM wildfire simulation.
- Fig 1, Supp Fig 6 - don't just show model and observations, but show the difference between the two
- It is important to acknowledge that SSP-5 is not business as usual, but a (fortunately) less likely higher emissions scenario (Hausfather & Peters 2020). Ideally the authors would include a more realistic scenario as well, but at the least they could acknowledge this.
- It's great that the authors include information about the fire model within ESMs in Supp Table 1, but these descriptions don't mean much to me or I suspect most readers. Can the authors expand this to a short para, or in some other way clarify the general nature of these fire modules and how they differ?
- The country based approach is definitely valuable (Fig 3) as is the Congo and Amazon zoom in (Supp Fig 9). However, I think you will find better relationships and more meaningful drivers if you focus on dominant fire types e.g. forest fire, grass fire, peats etc.

Review of NCOMMS-21-34508-T "Machine learning-based observation-constrained projections reveal elevated global socioeconomic risks to wildfire in the twenty-first century" by Yu et al.
This manuscript considers the use of machine learning techniques (MLTs) to better represent the nonlinear relationships between fire occurrence and various environmental factors at global scales, and to then use this to predict carbon release and socioeconomic impacts. Fire has traditionally been poorly represented in Earth system/global climate models and so studies like this one, which aim to improve the representation of fire, are vitally important in providing an accurate picture of global change. The use of MLTs to better express the nonlinear relationships between interacting aspects of complex systems has had success in other contexts, and so while their use is not new, the application to global fire occurrence and carbon release is particularly innovative. As such, the manuscript addresses a significant issue, and I am sure it will attract considerable scientific interest.
While my overall opinion is that the manuscript is worthy of publication, I feel that the manuscript could be strengthened by addressing a few issues that are not adequately addressed in the current version. These are as follows:
1. The authors mention the 2019-20 Australian fires, which are notable for the massive amount of carbon they released. Much of this occurred in connection with numerous episodes of extremely intense fire behaviour. These episodes have been linked to particular environmental factors, primarily rugged terrain, forest fuels and critically low fuel moisture content, and their interaction (e.g., . While it could be argued that land-use accounts for forest fuels and that fuel moisture content is somewhat covered by temperature and relative humidity, the list of variables considered by the authors (Extended Data
3. Fuel moisture, which depends on air temperature and relative humidity, can have a significant effect on wildfire behaviour. For example,  note that when fuel moisture content drops below certain thresholds, wildfire behaviour can be noticeably different. This is due to greater propensity for certain types of fire behaviour, such as spotting. My concern in this respect is that these thresholds are not incorporated in the approach used by the authors, and so their analyses may be missing important dynamics that could occur more often in the future as conditions become warmer. In particular, the methods used may not account for more intense fire behaviour, which could release substantially more carbon. The authors should note this as a limitation of their study and revise their conclusions accordingly.
4. The release of carbon from global fire constitutes an important climate feedback. Was this accounted for in the study? That is, were the carbon release estimates found in the study used to further inform the climatic projections used in a coupled manner? Again, if not, this should be noted as a limitation of the study.
I think that once these issues have been addressed the manuscript will be suitable for publication. However, I think doing so might constitute a major revision of the manuscript.

Reviewers' comments:
Reviewer #1 (Remarks to the Author): The work "Machine learning-based observation-constrained projections reveal elevated global socioeconomic risks to wildfire in the twenty-first century" presents a very promising method to forecast wildfire impacts and their associated risks. The authors propose a new methodological approach combining ESM and Machine Learning to introduce constraints on fire-related drivers in the modeling workflow.
The results suggest a clear improvement when compared with "original" methods both in terms of overall magnitude of CO2 emission and its spatial pattern. The procedure includes not only a validation that sets a common ground for comparison but several means to illustrate the main differences and improvements.
Overall, the conclusions are well aligned with the results, though additional clarifications and modifications are required. The methodology couples together ESMs and MLT in an effort to overcome some limitations of the current procedures. The main advantage is that MLT can embed non-linear feedbacks from wildfire drivers. The concept looks promising. The description of the methods is quite complete though further details need to be provided in order for the method to be fully reproduced. The ESM models and input data are well described. Likewise, the MLT and associated packages are also presented together with a summary of the overall workflow for validations and performance evaluation. In spite of the promising results that will surely contribute to the current state-of-the-art, there are some limitations that I believe require further investigation.
Response: Thank you very much for your encouragement and suggestions. We hope our additional analysis strengthens this manuscript.
1) In lines 196-197 it is stated that "four ESMs… …simulate a reduced global fire carbon emission during the first half of the twenty-first century". This seems rather unrealistic and partially responsible for the relatively low increase in emissions towards the end of the century (2.5% according to the proposed method compared to the 6% in the original one). The issue is clearly related to the 2 NorESM models, pulling down fire-carbon emissions during 2040-2050. The predicted drop is equal in magnitude to the increase from 2050 to 2100 while setting emissions below 2010's estimates. What would be the emission estimates disregarding these ESMs? Moreover, the sudden decay between 2040 and 2050 seems to be what drives risk trajectories (Figure 3a and 3g), which would otherwise be more similar.
Response: Thank you for raising this issue. In the revised manuscript, we include an additional predictor, orography, as suggested by another reviewer. We also include an additional observational data set of soil moisture. These changes lead to the updated results on global fire carbon emission prediction (Fig. 2) and associated socioeconomic risks (Fig. 3). As you can see, in the updated projection, the influence of the two NorESM models partially diminishes. To quantify the sensitivity of observation-constrained projection to the NorESM models, we repeat the analysis with a subset of the CMIP6 ESMs that excludes the two NorESM models (Extended Data Fig. 4 (Fig. 2a). An exclusion of the NorESM2 models leads to a slightly elevated future increase in fire carbon emissions produced by the observational constraint, especially over the northern extratropical land surface (Extended Data Fig. 4)." (page 8 lines 201-208)
2) Another major concern has to do with the use of MLT. More specifically, the limited capability of MLT (especially Random Forest) to extrapolate outside the range of observations may be hindering the projections.

Response:
We agree that extrapolation remains one of the key uncertainties embedded in the implementation of MLT. Our analytical framework uses CMIP6 historical results as input and CMIP6 future simulations as output to train three selected MLT, and then feeds observational data into the trained MLT (Methods and Extended Data Fig. 3). Therefore, potential extrapolation occurs when the observational data space of predictors is not fully covered by the multimodel simulation of historical data space for these predictors. To demonstrate the data space coverage of observations and multimodel simulations, we add scatterplots of observational and simulated predictors in Extended Data Fig. 14. The observational data space is mostly covered by the multimodel historical simulation, thereby ensuring minimal extrapolation uncertainty involved in the MLT. We include a brief description of the uncertainty source in the revised manuscript: "The reliability of MLT is degraded when the actual observational data space is insufficiently covered by the training (historical CMIP6 simulation) data space, namely the extrapolation uncertainty. Here, we further evaluate the data space of both observation and historical simulation of the climate and fire variables (Extended Data Fig. 14), and we find all these assessed variables are largely overlapped, indicating minimal extrapolation error involved in the current MLT application." (pages 20-21 lines 557-562)
3) In the same line, how does the model respond to a more conservative scenario? What's the difference in magnitude compared to an SSP2-4.5, for instance? My main concern here has to do with the sensitivity of the proposed method to properly address different scenarios. Please, can you elaborate on this?
Response: Thank you for this helpful comment. To demonstrate the applicability of the current approach in addressing different scenarios, we apply our framework to nine CMIP6 ESMs that provide SSP2-45 simulations (Extended Data Fig. 12) and compare the results with the analysis conducted for SSP5-85 with the same set of models (Extended Data Fig. 13). The results and implications are discussed in the revised manuscript: "The projected fire regimes and their socioeconomic risks depend on the projected socioeconomic pathway. The currently examined SSP5-85 reflects a high-emission scenario 58 , whereas a lower-emission scenario, SSP2-45, suggests a generally milder increase in global fire carbon emission, for both the original and observation-constrained ensembles (Extended Data Fig 12). In the northern subtropical and mid-to-high latitudes, while the default ensemble indicates a spatially homogeneous but slightly weaker increase in fire carbon emission in SSP2-45, compared with that estimated for SSP5-85 from the same set of ESMs (Extended Data Fig. 13
4) Three MLT with very different parameterization were used. While random forest is relatively easy to tune, support vector machines are quite challenging to optimize. Further insights into model calibration are required. Since caret is involved I assume parameter grids were fed to optimize parameter selection. I understand this is not a "manual" on how to calibrate ML models but the very basics must be provided. E.g., number of trees for RF, kernel types in SVM, learning rates. Were these parameters set constant over time?
Response: Thank you for this helpful suggestion. The parameter optimization is now briefly discussed in the Methods section: "The prediction model is fitted for each MLT using the training data set that targets each future decade, with parameters optimized for the minimum RMSE via 10-fold cross-validation; in other words, using a randomly chosen nine-tenths of the entire spatial sample (n = 10,193) for model fitting and the remaining one-tenth of the entire spatial sample (n = 1,132) for validation, and repeating the process 10 times. For svmRadialCost, the optimal pair of cost parameter (C) and kernel parameter sigma (sigma) is searched from 30 (tuneLength = 30) C candidates and their individually associated optimal sigma. For gbm, we set the complexity of trees (interaction.depth) to 3, and learning rate (shrinkage) to 0.2, and let the "train" function search for the optimal number of trees from 10 to 200 with an increment of 5 (10, 15, 20, …, 200). For rf, the number of variables available for splitting at each tree node (mtry) is allowed to search between 5 and 50 with an increment of 1 (5, 6, 7, …, 50); the number of trees is determined by the algorithm provided by randomForest package and the "train" function by the caret package." (page 21 lines 570-582)
Moreover, how were the MLT combined? I mean, how were predictions from the three models integrated? Is there any kind of "model averaging" involved?
Minor concerns
Abstract. References are not common in an abstract and I don't think the ones included are critical. They are cited right away in the Introduction.
Response: Thank you; references were removed from the Abstract.
L53. Consider replacing "global fire regimes" with "fire regimes across the globe" or something similar. There are no actual global fire regimes but a collection of fire regimes and pyroregions.
Response: Thank you; we have changed "global fire regimes" to "fire regimes across the globe" throughout the manuscript.
L82. "reducing uncertainties". I would say also spatial inaccuracies, judging by the outputs presented.
One of the most striking improvements comes from the improved capability of the method to properly capture the spatial patterns. See Fig. 1 for example.
Response: Thank you; the relevant sentence has been changed to "…we hypothesize that constraining the ESM wildfire estimates by observations is a potentially valid approach for reducing spatial inaccuracies in global wildfire projections and related socioeconomic risks." (page 3 lines 74-76)
L151-152. About the model performance. Some insights into the bias (over vs underestimation) and its spatial footprint would enrich the findings.
Response: That's correct. In the revised manuscript, we add a brief discussion on the model performance, bias and its spatial footprint: "The overestimated historical and future enhancement of fire carbon emissions simulated by the default ESMs is mainly distributed across the historically sparsely vegetated regions (Extended Data Fig. 4), potentially as a result of unrealistic representation of dynamic vegetation processes 45 . Furthermore, these ESMs display consistently strong linkage between the simulated historical and future fire carbon emissions (Extended Data Fig. 9), suggesting the primary need to improve historical wildfire simulation for better prediction of future wildfire evolution." (pages 14-15 lines 370-376)
L155. Replace "more optimal" with "better".
Response: "More optimal" has been changed to "better" in the revised manuscript. (page 6 line 160)
L177-178. Was this calculated at pixel level or are R2 calculated for each ESM? If it is the first, acknowledging the regions with weaker performances will improve the understanding of the outputs.

Response:
The spatial R² and correlation were calculated for each ESM and their multimodel mean, using all pixels on the globe. This sentence has been changed to "e. Squared spatial correlation (R² Fig. 14), and we find all these assessed variables are largely overlapped, indicating minimal extrapolation error involved in the current MLT application." (pages 20-21 lines 551-562)
L545. Rather increased observational data improving the spatial accuracy of the interpolation.

Response:
There was an error in the caption of old Extended Data Fig. 7 (new Extended Data Fig. 8). The observational data sets are all at 0.25° resolution, regardless of the input model data resolution. This error was corrected: "Because the MLT are trained using the global spatial sample, we expect the performance of MLT to be sensitive to the spatial resolution of the training data set. This assumption is tested by varying the interpolation grids (1°, 2.5°, 5°, and 10° latitude by longitude) of the ESMs and fitting MLT using this specific-resolution training data for the validation period (Extended Data Fig. 7). Observational data sets at 0.25° resolution are subsequently fed into the fitted MLT models, regardless of the input model data resolution. This sensitivity test sheds light on the importance of spatial resolution to our observational constraining and thereby implies potential accuracy improvement of our MLT-based observation constraint with the development of higher-resolution ESMs." (page 22 lines 601-609). Thus, this exercise suggests that increased model resolution tends to improve the spatial accuracy of the observational constraint.

L561. Interpolated or resampled?
Response: Thank you for the correction. In this analysis, we sum up the finer-resolution population, GDP, and agricultural area within 0.25° pixels. We believe "resampled" is the correct word. (page 23 line 623)
L588. I think average or median values plus standard deviation or IQR are more appropriate.

Response:
We have elaborated the reason for choosing the highest importance score among the annual mean and 12 monthly mean values to represent the overall importance of a particular variable: "For the atmospheric and terrestrial variables that include annual mean and monthly climatology as predictors, to account for the overall importance of a particular variable while considering the possible information overlapping contained in each month and annual mean, the importance of each variable is represented by the highest importance score among these 13 predictors (annual mean, January, February, …, December)." (page 24 lines 654-659)
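The aggregation rule quoted above (one overall score per variable, taken as the highest score among its annual-mean and 12 monthly predictors) can be illustrated with a minimal Python sketch. This is not the authors' code, which relies on R packages; the function name and the numeric scores below are hypothetical and serve only to show the rule.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
PERIODS = ["annual"] + MONTHS  # the 13 predictors derived from one variable


def overall_importance(scores):
    """Collapse per-predictor scores into one score per variable.

    scores maps (variable, period) -> importance score. The result maps
    each variable to the highest score among its available predictors.
    """
    best = {}
    for (variable, _period), value in scores.items():
        if variable not in best or value > best[variable]:
            best[variable] = value
    return best


# Hypothetical importance scores, for illustration only.
example_scores = {
    ("temperature", "annual"): 12.0,
    ("temperature", "Jul"): 30.5,
    ("precipitation", "annual"): 18.0,
    ("precipitation", "Jan"): 7.2,
}
result = overall_importance(example_scores)
```

Taking the maximum rather than the sum avoids counting the same information several times when the monthly and annual predictors overlap, which appears to be the rationale given in the quoted passage.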

Yu et al NCC emergent constraints method for wildfire
The authors perform a variation of an emergent constraints analysis on earth system model projections of wildfire emissions under climate change. They use multiple fire relevant variables in combination with several machine learning techniques to constrain current and future estimates of the emissions from wildfire around the globe. They also estimate the exposure of population, GDP and agriculture to changes in emissions. Their technique does indeed adjust model outputs closer to observations, and results in a narrower, and generally lower estimate of the trajectory of future wildfire emissions, albeit one that is still rising steadily throughout the 21st century. Congratulations to the authors on an interesting and potentially useful study on an important topic. Improving projections of global wildfire activity is a very worthy task and the paper is generally well organised with good figures. Applying the emergent constraints paradigm to wildfire modelling is novel, as is the use of multiple variables and a machine learning approach. I have some high level concerns, as well as some more specific ones, which I detail below. In my opinion it is critical that an expert on EC be included among the reviewers. I am not an expert on emergent constraints, but it is not clear to me that the authors have avoided the pitfalls associated with this approach. Some of these are
Response: Thank you for your positive feedback, valuable comments, and insightful references. In the revised manuscript, we have explicitly discussed (1) the EC pitfalls inducing limited reliability of traditional EC in the projection of fire carbon emissions, (2) the advances of our MLT-based EC framework over the traditional EC, and (3) the remaining uncertainties contained in most EC studies and incompletely addressed in this manuscript.
We have outlined the pitfalls associated with traditional EC that lead to the limited reliability of EC in the projection of fire carbon emissions: "However, the traditional EC framework appears less reliable in the projection of fire carbon emissions (Extended Data Fig. 2
The performance of the MLT-based EC has been elaborated in the revised manuscript in terms of the MLT algorithms, or in-sample statistical strength: "The prediction model is fitted for each MLT using the training data set that targets each future decade, with parameters optimized for the minimum RMSE via 10-fold cross-validation; in other words, using a randomly chosen nine-tenths of the entire spatial sample (n = 10,193) for model fitting and the remaining one-tenth of the entire spatial sample (n = 1,132) for validation, and repeating the process 10 times…. The cross-validation R² values exceed 0.8 (n = 1,132) for all optimized MLT and all future periods." (page 21 lines 570-583). Since we used the global spatial sample to train the MLTs, it is not straightforward to identify regions where the in-sample statistical relationship is strong or weak.
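As a rough illustration of the 10-fold cross-validation and tuning grids quoted from the Methods, the following Python sketch builds the fold splits and candidate grids using only the standard library. It is not the authors' implementation, which uses the caret "train" function in R; the total sample size is assumed to be 10,193 + 1,132 = 11,325, and the helper names (kfold_indices, train_validation_splits, pick_best) and the cv_rmse callable are hypothetical.

```python
import random

N_SAMPLES = 11_325  # assumed total: 10,193 (fitting) + 1,132 (validation)
N_FOLDS = 10


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_validation_splits(folds):
    """Yield (train, validation) index lists, holding out one fold at a time."""
    for k, held_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, held_out


# Candidate grids as quoted in the response.
gbm_n_trees = list(range(10, 201, 5))  # 10, 15, ..., 200
rf_mtry = list(range(5, 51))           # 5, 6, ..., 50


def pick_best(candidates, cv_rmse):
    """Return the candidate with the lowest cross-validated RMSE.

    cv_rmse is a hypothetical callable: candidate -> mean RMSE over folds.
    """
    return min(candidates, key=cv_rmse)


folds = kfold_indices(N_SAMPLES, N_FOLDS)
splits = list(train_validation_splits(folds))
```

With 11,325 samples, each held-out fold has 1,132 or 1,133 pixels and each fitting set has 10,193 or 10,192, consistent with the sample sizes quoted above.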
In terms of actual performance of MLT-based EC, or out-of-sample statistical strength, Fig. 1 and associated text demonstrate the accuracy of this approach using multimodel ensemble and individual models during the validation period. Specifically, Fig. 1c shows the spatial distribution of statistical strength of the current approach, as described in the revised manuscript: "The observation-constrained product substantially reduces the overestimation of fire carbon emissions over sparsely vegetated regions (mainly for EC-Earth3 models, Extended Data Fig. 1), tropical rainforests (mainly for EC-Earth3 models), northern boreal regions (mainly for MRI-ESM2.0), and densely populated regions in North America and Europe (mainly for CNRM-ESM2.1 and MPI-ESM1.2-LR), as well as the underestimation of fire carbon emissions over the savannahs in Africa from most analyzed ESMs (Fig. 1a-c). Relatively large error between the observation-constrained and observed fire carbon emissions remains in the present fire-prone regions (e.g., tropical and subtropical Africa, subtropical South America, and southeast Asia) (Fig. 1c).
o Overlooking the potential for tipping points - these are widely recognised to be a risk for some fire regimes
Response: Thank you for pointing this out. We note this as a limitation of our study: "Finally, the current MLT-based observation-constraint framework does not directly account for potential tipping points in fire regime evolution 59,65 or certain threshold in fuel moisture content below which more intense fire behavior may occur 22 . The applicability of our framework to these extreme fire regimes needs further investigation." (pages 17-18 lines 464-468)
o The mechanisms underpinning the relationship. I applaud the authors for including multiple relevant variables, but the use of machine learning largely obscures any mechanisms at play (the variable importance plots in the Supp Info notwithstanding).
The authors describe this as process-based but I'm not sure this qualifies, as there is no explicit process. The authors do not provide any information on how they selected those variables (L518-520), or whether a different set might have performed better. Given the centrality of fire to this study, it would be good to see this addressed.
Response: Throughout the manuscript, we have clarified that "process-based" mainly means the use of Earth system models, for example, "Process-based Earth system approaches, such as the use of Earth system models (ESMs), have the potential to account for many human-vegetation-fire-climate interactions and are thus suggested as a practical way to predict future changes of wildfire and associated socioeconomic exposure (e.g., population, gross domestic product [GDP], and agricultural area)." (page 3 lines 51-54)
The selection of driving variables has been justified in the revised manuscript: "Observed, historical environmental (e.g., fire carbon emission, leaf area index [LAI], soil moisture, temperature, precipitation, wind, relative humidity, flash rate, and orography) and socioeconomic (e.g., land use and population) variables (Extended Data Table 2 and reference therein) are subsequently fed into the trained MLT models, resulting in a multimodel, multi-data set ensemble of observation-constrained projections of future global distribution of fire carbon emissions for each decade. These driving variables of wildfires are selected so that their nonlinear combinations as determined by MLT reflect the fuel abundance 8 (LAI, temperature, precipitation), fuel moisture 37 (soil moisture, relative humidity, precipitation, temperature), fire spread conditions 38 (wind and orography), and ignition sources 39 (flash rate, land use and population)." (page 5 lines 128-137)
o Quality of observations. It is worth acknowledging limitations in this data. Wind speed in particular suffers from a lack of long term high quality observational datasets against which to judge the accuracy of reanalysis values. Relative humidity, flash rate and soil moisture are probably also a fair way behind variables like temperature and rainfall in terms of their accurate simulation. See L409
Response: Thank you for this comment.
In the revised manuscript, we have added one new set of global observation-based soil moisture products (Supplemental Data Table 2); also, we have elaborated more on the observational data uncertainty: "Although we analyze a spectrum of data sources for most climatic and ecosystem variables, the single data set used for lightning and socioeconomic variables, as well as the deteriorated reliability of reanalysis-based wind and specific humidity over observation-sparse regions 61 , likely leads to a weakened constraint gained from these variables." (page 17 lines 438-441)
o Can you clarify that you have avoided the possibility of strong but "overconfident" constraints?
Response: Based on the performance of our MLT-based observation-constraint during the historical validation period, we are confident that this framework effectively reduces model biases in the near future (i.e., using observations during 1997-2006 to constrain projections during 2007-2016), while this framework's reliability in the longer term partially relies on the accuracy of physical processes contained in the ESMs and remains to be tested. We have clarified this point in the revised manuscript: "Benefiting from the inclusion of the complete spatial sample, this observational constraint leads to a consistent and substantial error reduction in simulated global wildfire distribution (Fig. 1), demonstrating the robustness of our analytical framework even with just 13 ESM ensembles. Indeed, our MLT-based framework also shows satisfactory efficiency in error reduction with only 6 CMIP6 ESMs that simulate burned area fractions (Extended Data Fig. 7). 
Although the current MLT-based EC framework improves the spatial accuracy of original ESM-simulated fire carbon emissions during the historical validation period, the performance of our framework in the future decades partially relies on the accuracy of ESMs' physical processes (e.g., complex responses of fire regimes to various natural and anthropogenic forcings) and must be further evaluated." (pages 13-14 lines 338-347)
o Can you give an indication of reasonable performance of EC and its basis? Presumably the process by definition improves accuracy, so what is the best way to judge its effectiveness?

Response:
We have used historical data to test the out-of-sample performance of the MLT-based EC, and relevant results are shown in Fig. 1 and corresponding text (pages 6-7 lines 143-188).
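The out-of-sample test described in this response can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a single hypothetical "dryness" driver replaces the full set of fire-relevant variables, and a small k-nearest-neighbour regressor stands in for the MLT models used in the manuscript. It only shows the logic of the test: learn the driver-to-fire relationship from the model's own output, feed in the observed driver state, and compare both the raw and the constrained estimates against observed emissions from a period held out of training.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points nearest to `query`."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(pred, truth):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def fire_response(dryness):
    """Hypothetical nonlinear fire response, shared by the toy 'real world' and toy ESM."""
    return dryness ** 3


n_cells = 200
# Synthetic "observed" driver in the training decade, one value per grid cell.
obs_dryness = [(i + 0.5) / n_cells for i in range(n_cells)]
# Toy ESM: biased driver state (too dry) but the correct fire response (an assumption).
esm_dryness = [x ** 0.5 for x in obs_dryness]

# Fire emissions in the held-out validation decade.
obs_fire = [fire_response(x) for x in obs_dryness]
esm_fire = [fire_response(x) for x in esm_dryness]

# Train on the ESM's own driver -> fire pairs, then feed in the observed driver.
constrained_fire = [knn_predict(esm_dryness, esm_fire, x) for x in obs_dryness]

raw_rmse = rmse(esm_fire, obs_fire)
constrained_rmse = rmse(constrained_fire, obs_fire)
print(f"raw ESM RMSE:     {raw_rmse:.3f}")
print(f"constrained RMSE: {constrained_rmse:.3f}")
```

In this toy setting the constraint helps because the simulated fire response is correct and only the driver state is biased; if the response itself were wrong, the constrained estimate would inherit that error, which is the caveat raised above about longer-term reliability.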
-There are a number of simple errors in language and referencing which can easily be addressed, but somewhat undermine confidence in the rest of the manuscript e.g. o It is risks from / posed by wildfire, not risks to wildfire

-It is important to acknowledge that SSP-5 is not business as usual, but a (fortunately) less likely higher emissions scenario (Hausfather & Peters 2020). Ideally the authors would include a more realistic scenario as well, but at the least they could acknowledge this.
Response: Thank you for the good suggestion. To address the dependence of fire projection on emission scenario, we have applied our framework to 9 CMIP6 ESMs that provide SSP2-45 simulations (Extended Data Fig. 12) and have compared the results with the analysis conducted for SSP5-85 with the same set of models (Extended Data Fig. 13). The results and implications have been discussed in the revised manuscript: "The projected fire regimes and their socioeconomic risks depend on the projected socioeconomic pathway. The currently examined SSP5-85 reflects a high-emission scenario 58 , whereas a lower-emission scenario, SSP2-45, suggests a generally milder increase in global fire carbon emission, for both the original and observation-constrained ensembles (Extended Data Fig 12). In the northern subtropical and mid-to-high latitudes, while the default ensemble indicates a spatially homogeneous but slightly weaker increase in fire carbon emission in SSP2-45, compared with that estimated for SSP5-85 from the same set of ESMs (Extended Data Fig. 13
-The country based approach is definitely valuable (Fig 3) as is the Congo and Amazon zoom in (Supp Fig 9). However, I think you will find better relationships and more meaningful drivers if you focus on dominant fire types e.g. forest fire, grass fire, peats etc.
Response: Nice suggestion! The projected fire carbon emission trends and their drivers are added in Extended Data Fig. 11 along with the Congo and Amazon (Extended Data Fig.10). The results are discussed in the revised manuscript: "In particular, the observation-constrained ensemble projects increased wildfire activity over the Amazonian and Congo Basins, in contrast to the default simulation for Congo and to a larger extent for Amazon (Fig. 2b, c). The observation-constrained projection of pantropical enhancement in wildfire activities is likely affected by the changes in soil moisture and relative humidity (Extended Data Fig. 10), consistent with previous conclusions regarding accelerated drying over the tropics under climate change 46 and increased occurrence of severe tropical droughts [47][48][49] . Such apparent association between future drying and elevated fire carbon emission is also identified over other forest, grassland, and cropland, as estimated by the observation-constraint (Extended Data Fig. 11). In the Congo basin, the projected elevation in the amount of fuel 50 , partially reflected by the positive contribution of leaf area index trends to the future increased fire carbon emissions as indicated by the observational constraints (Extended Data Fig. 10b), further supports a more flammable future. The leading role of fuel abundance in future fire regimes also appears in other forest, shrubland, savannahs, and cropland (Extended Data Fig. 11). Because of the global, spatial sampling approach (see Methods), our constraining approach results in a much weaker contribution of projected local socioeconomic development (e.g., population density and land use) to the projected trend in fire carbon emissions than the default ensemble, for all major land cover types (Extended Data Fig. 11). 
Although the parameterized anthropogenic source and suppression of wildfires in ESMs reflects valuable efforts to represent socioeconomic influence on wildfire regimes, their accuracy and applicability to future scenarios remain to be rigorously evaluated. In this perspective, our MLT-based observation-constrained ensemble raises an alternative scenario of future evolution of fires in the Congo region-with relative weak anthropogenic suppression and/or more anthropogenic ignitions than that estimated by CMIP6 ESMs." (page 15 lines 378-400)
(e.g., . While it could be argued that land-use accounts for forest fuels and that fuel moisture content is somewhat covered by temperature and relative humidity, the list of variables considered by the authors (Extended Data
Response: Thank you for suggesting adding the key references on driving processes of intensive fire behavior. We have included a statement of such processes in the revised manuscript: "However, the linkages between fire weather and wildfire activity are greatly affected by other factors, including terrain, fuel abundance, fuel moisture content, source of ignition, and their interactions [19][20][21][22] ." (page 3 lines 63-65)
2. How does Leaf Area Index (LAI) account for the extensive regions of grasslands, savannah and other important vegetation types around the globe (e.g., xeric shrublands, spinifex, etc.)? Moreover, how does satellite-derived LAI account for fuels in the surface and near-surface layers of forests, noting that it's the fuels within these layers (and their dryness) that have the greatest influence on fire occurrence and account for a considerable proportion of carbon release?
Response: Because of a lack of reliable, long-term observations on fuel abundance, our analysis uses LAI, precipitation, and temperature to reflect fuel abundance. Also, because of the lack of model outputs on fuel dryness, we have used temperature, precipitation, relative humidity, and soil moisture to reflect the fuel dryness. The reason why these driving variables were chosen is outlined in the revised manuscript: "These driving variables of wildfires are selected so that their nonlinear combinations as determined by MLT reflect the fuel abundance 8 (LAI, temperature, precipitation), fuel moisture 37 (soil moisture, relative humidity, precipitation, temperature), fire spread conditions 38 (wind and orography), and ignition sources 39 (flash rate, land use and population)." (page 5 lines 134-137) In the revised manuscript, we have also added the discussion on such uncertainty caused by observational data availability or modeling capability: "Second, the inconsistency between observed quantities and model-simulated variables limits further strengthening of our observational constraint. For example, the above ground biomass, as provided by most ESMs, more directly captures the amount of fuel than the combination of LAI, temperature, and precipitation, as used in our current analytical framework. Yet, a lack of long-term, reliable observational record of above ground biomass prohibits the direct use of such key driving variable in the current analysis." (page 17 lines 448-454)
3. Fuel moisture, which depends on air temperature and relative humidity, can have a significant effect on wildfire behaviour. For example,  note that when fuel moisture content drops below certain thresholds, wildfire behaviour can be noticeably different. This is due to greater propensity for certain types of fire behaviour, such as spotting. 
My concern in this respect is that these thresholds are not incorporated in the approach used by the authors, and so their analyses may be missing important dynamics that could occur more often in the future as conditions become warmer. In particular, the methods used may not account for more intense fire behaviour, which could release substantially more carbon. The authors should note this as a limitation of their study and revise their conclusions accordingly.
Response: Thank you for the good point. The use of MLTs addresses the nonlinear dependence of fuel moisture on soil moisture, air temperature, and relative humidity to some degree, as outlined in the revised manuscript: "These driving variables of wildfires are selected so that their nonlinear combinations as determined by MLT reflect the fuel abundance 8 (LAI, temperature, precipitation), fuel moisture 37 (soil moisture, relative humidity, precipitation, temperature), fire spread conditions 38 (wind and orography), and ignition sources 39 (flash rate, land use and population)." (page 5 lines 134-137) We have also noted this as a limitation of our study: "Finally, the current MLT-based observation-constraint framework does not directly account for potential tipping points in fire regime evolution 59,65 or certain threshold in fuel moisture content below which more intense fire behavior may occur 22 . The applicability of our framework to these extreme fire regimes needs further investigation." (pages 17-18 lines 464-468)
4. The release of carbon from global fire constitutes an important climate feedback. Was this accounted for in the study? That is, were the carbon release estimates found in the study used to further inform the climatic projections used in a coupled manner? Again, if not, this should be noted as a limitation of the study.
Response: Thank you for pointing this out. Our approach provides an offline fire projection constraint and does not account for climate or ecological feedbacks of fire carbon emission. This limitation has been addressed in the revised manuscript: "Although the current approach does not account for climate or ecological feedbacks of global fire carbon emissions, dynamical coupling between observation-constrained fire carbon emissions and other components of the Earth system will likely result in a more reliable projection of all these components." (page 14 lines 366-370)