Introduction

Fire is a global phenomenon that has existed since the emergence of terrestrial plants1, and is currently present across all vegetation types2. The long cohabitation of vegetation and fire has induced their co-evolution3, and has shaped adaptive strategies within different plant species. Our understanding of fire activity and its relationship with its influencing factors remains limited, especially at large spatial scales4, owing to the absence of consistent long-term data5,6. Currently, satellite data are the primary source for analyzing fire activity and studying historical and contemporary fire regimes globally, owing to their comparability in time and space, despite the limited length of the time series7. Although studies characterizing fire activity at the regional level are common8,9,10,11,12,13,14, the lack of long temporal series has limited the study and assessment of global fire activity and its potential influencing factors7,15,16,17,18,19,20.

Several researchers have used global forest fire data to investigate various topics, including the impacts of fire on vegetation and emissions, as well as the factors influencing spatial and temporal variations in fire activity4,21,22,23,24,25,26. It has been demonstrated that changing environmental conditions and human activities can and will continue to modify fire activity in several parts of the world. However, these findings have been primarily based on regional-scale studies, which do not always reflect the global reality or its variability with regard to either influencing factors or societal perceptions27. The complex interrelations between the environment, climate, human factors, and fire dynamics make any related study challenging on a global scale.

The relationship between vegetation and fires in a specific ecosystem is characterized by a fire regime3,15. A fire regime is defined as a set of consistent and repeated wildfire conditions in a particular region over a long time5,28. These regimes depend on a combination of factors influencing fire occurrences and characteristics, such as ignition conditions, fuel compositions and arrangements, and weather conducive to fire29. The fire conditions that define a fire regime5 have a clear impact not only on the presence of certain vegetation types, but also on the soil and atmospheric characteristics7. The combinations of the different factors influencing fire activity are expected to change unevenly across the globe, generating diverse patterns of the occurrence and severity of fires25. The majority of terrestrial ecosystems are familiar with fire, and in many cases, rely on its effects to maintain their natural dynamics and health30,31. Yet, significant changes in fire activity may lead to unwanted consequences for human communities and ecosystem conservation practices16,32. Moreover, although the presence of a certain fire regime depends on a particular combination of climate, vegetation, and human activity15, similar regimes can appear even with a different combination of factors, making future wildfire predictions a difficult task21.

The interpretation of the concept of fire regimes has changed over the years, and remains a matter of discussion. In one study28, the authors provided insight into the discussion by dividing the present interpretations into (a) studies that consider that a fire regime or its evolution reflects a core group of parameters describing where, when, and which type of fires occur in a certain area (e.g., chronology, seasonality, time since fire, size, intensity); (b) studies that use the factors influencing the occurrence and characteristics of fires to frame fire regimes (e.g., fuel characteristics, fire weather, ignition sources, potentially related events); and (c) those that use the immediate impacts of fires on ecosystems and humans to determine the consequences and spatial context of a fire regime (e.g., ecological severity, damage to society). These approaches are often not used independently, and combinations of them are frequently used to provide a more comprehensive understanding of the complex relationships between fire activity, its driving factors, and the impacts on ecosystems33. However, when studying fire regimes at a global scale, there is a clear barrier, as no specific spatial frame has been set for those regimes (i.e., it is non-trivial to delimit those specific regions and achieve a level of homogeneity of fire characteristics, a clear pattern of landscape vegetation, and/or a distinctive association with other factors); nevertheless, they should combine similar characteristics with regard to fire activity, driving factors, and impacts on ecosystems. An attempt to overcome this limitation has been to define “pyromes” and to evaluate how frequently they occur within a biome7,15. Still, neither the use of pyromes or biomes nor the cross-evaluation of pyromes within a biome (and vice versa) provides a clear regional delimitation.
Therefore, as science has moved beyond the concept that a fire regime is simply a homogeneity of fire activity over a period of time in a certain area, in order to accommodate new views regarding influencing factors and cohabitation with a vegetation type, this evolution has also limited the capacity to clearly delineate the extent of a fire regime.

An increased understanding of historical and contemporary fire regimes will provide an essential tool for knowledge transfers between regions that share a regime. As a result, any study on the factors altering the evolution of fire activity or its impacts will be greatly improved if implemented based on fire regimes7,15. In addition, the knowledge gathered could provide useful insights and improve various studies, e.g., by providing relevant information for characterizing and assessing the impacts of contemporary fire regimes on ecological aspects such as vegetation adaptability, soil degradation, carbon stocks, air quality/pollution, and conservation of the biosphere34,35,36,37. Moreover, the potential evolution of the regimes, their expected impacts on human communities38,39,40,41, and measures for mitigating future threats or for restoring desirable conditions prior to a regime change30 could be determined. Therefore, characterizing global fire regimes (historical and contemporary) affects multiple areas of knowledge.

Advances in generating global-scale fire data19,42,43 and in its accessibility44 can be combined with artificial intelligence and other processing techniques to unveil previously unseen patterns in environmental phenomena, based on processing large and complex data. This is the key to providing a better understanding of the current and future status of fire dynamics worldwide. Despite its current limitations, some of the premises of the use of large data45, such as large data storage management, the use of innovative processing methods to solve multidimensional problems, and the sharing of data and information, can facilitate the understanding of complex problems, such as identifying fire regimes across the globe.

Here, we present a comprehensive study aimed at understanding and characterizing contemporary global fire regimes with a statistical framework that analyzes yearly global wildfire events over 19 years from vectorial 500-m resolution datasets. We consolidate these data to a resolution of 1 × 1 degree grids covering the entire planet, so as to calculate annual statistics on fire characteristics (see Methods). Next, we derive a global classification of fire pyromes (areas with similar fire-related characteristics) with unsupervised machine learning. This grouping step distinguishes our analysis from the majority of existing categorization methods, as we make no prior assumptions regarding the spatial positioning of each group, similar to new regional research studies46 using a non-spatially constrained unsupervised learning approach. The investigation results in the identification of pyromes with major geographical dispersions and complexity. This leads us to explore an extended spatial assessment of their core areas to evaluate variations in the seasonality of fire activity, and to determine the key underlying factors47,48 defining specific contemporary fire regimes, such as climate, land use, and socio-economic factors (novel in this kind of analysis). We propose a spatial delimitation of modern regimes (i.e., with contemporary data from 2000–2018) within the pyromes by defining areas with similar fire activity and selecting underlying factors, thereby providing a basis for understanding fire regimes or the distribution of fire across the globe (and its underlying factors).

Results and discussion

Determining contemporary fire regimes

The proposed framework is presented in Fig. 1, emphasizing the knowledge discovery and extraction process of the study. The core operations are performed at the third stage of the framework, where unsupervised learning algorithms and techniques for processing multidimensional data, such as self-organizing maps (Fig. 2), are applied to determine patterns from the data and to validate the interpretability of the obtained results (see Methods section).

Fig. 1: Overall framework of the study.

a Wildfire data describing individual events in terms of fire-related characteristics such as size, perimeter, duration, and average expansion are collected from products derived from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations. b The data are processed and consolidated into a raster dividing the world into a grid with a resolution of 1 × 1 degree. Annual statistics and features are calculated for each cell, generating numerical (e.g., average fire frequency per time period) and spatial datasets. c Statistical methods to analyze multidimensional data are combined with unsupervised learning in order to discover similar groups of cells sharing fire-related characteristics. No explicit spatial components are included in this step. d Climatic and socio-economic layers are introduced for each cell in the grid. e Spatial density plots are generated for each pyrome, detecting the regions of the world with more observations, assumed to spatially frame a specific regime. The detected fire pyromes and regimes are characterized using climatic and demographic data. An evaluation of the influencing factors is performed for the most relevant areas. A temporal analysis to determine the trends and seasonality patterns of fire activity is also conducted. f All results and generated datasets are deployed on cloud services and a public-access repository, along with the scripts to reproduce all steps of the study.
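Step b of the framework can be illustrated with a minimal Python sketch that assigns individual fire events to 1 × 1 degree cells and averages the six fire characteristics per cell. The record keys (`lat`, `lon`, `size_km2`, `perimeter_km`, `duration_days`), the function names, and the definition of expansion as size divided by duration are assumptions made for this example; they are not taken from the study's scripts or data schema.

```python
import math
from collections import defaultdict


def cell_of(lat, lon):
    """Return the (lat, lon) index of the south-west corner of the 1 x 1 degree cell."""
    return (math.floor(lat), math.floor(lon))


def cell_statistics(events, n_years):
    """Average the fire characteristics of all events falling in each cell.

    `events` is an iterable of dicts with the hypothetical keys
    lat, lon, size_km2, perimeter_km and duration_days.
    """
    by_cell = defaultdict(list)
    for ev in events:
        by_cell[cell_of(ev["lat"], ev["lon"])].append(ev)

    stats = {}
    for cell, evs in by_cell.items():
        n = len(evs)
        stats[cell] = {
            "frequency": n / n_years,  # fires/year
            "size": sum(e["size_km2"] for e in evs) / n,  # km2
            "perimeter": sum(e["perimeter_km"] for e in evs) / n,  # km
            "duration": sum(e["duration_days"] for e in evs) / n,  # days
            # Assumed definition: mean of per-event size / duration (km2/day).
            "expansion": sum(e["size_km2"] / e["duration_days"] for e in evs) / n,
            # Mean of per-event perimeter / area (km^-1).
            "perimeter_area_ratio": sum(e["perimeter_km"] / e["size_km2"] for e in evs) / n,
        }
    return stats


# Toy events (illustrative values only, not observations from the study).
events = [
    {"lat": -15.2, "lon": 130.7, "size_km2": 40.0, "perimeter_km": 30.0, "duration_days": 4.0},
    {"lat": -15.9, "lon": 130.1, "size_km2": 10.0, "perimeter_km": 15.0, "duration_days": 2.0},
    {"lat": 60.5, "lon": -120.3, "size_km2": 6.0, "perimeter_km": 12.0, "duration_days": 3.0},
]
stats = cell_statistics(events, n_years=2)
```

The first two toy events fall in the same cell, so their characteristics are averaged, while the third event forms a cell of its own. Cells without any event simply do not appear in the result, which corresponds to the "no fire activity" group discussed in the text.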

Fig. 2: Discovering pyromes.

Self-organizing maps are useful for summarizing multidimensional fire data and for determining potential groups of similar characteristics. These data are reduced to a two-dimensional grid, and the samples are organized according to their Euclidean distance. Observations sharing similar characteristics are easily visualized in a topographic map (a) where warmer colors represent widely separated samples, and cooler colors depict closely related values. Using image processing algorithms (see Methods), we detect relevant potential pyromes/clusters (red circles). The number of observations belonging to each section of the map can be presented in a matrix known as a hit-map (b). As an example, we can easily observe the group of cells without fire activity as a large dark blue region (top) and white valley (bottom), representing a relevant percentage of the observations.
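The mechanics described in this caption can be sketched with a very small self-organizing map written in plain Python: each sample is matched to its closest node by Euclidean distance, the node and its grid neighbours are pulled toward the sample, and a hit-map counts the samples per node. This is an illustrative sketch under simplifying assumptions (linear decay of the learning rate and neighbourhood radius, a Gaussian neighbourhood, toy two-dimensional samples); the map size, hyperparameters, and image-processing step used in the study are described in its Methods and are not reproduced here.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node closest to x (squared Euclidean distance)."""
    return min(weights, key=lambda k: sum((a - b) ** 2 for a, b in zip(weights[k], x)))


def train_som(samples, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = max(rows, cols) / 2
    # Initialise every node with a randomly chosen sample.
    weights = {(r, c): list(rng.choice(samples)) for r in range(rows) for c in range(cols)}
    total, step = epochs * len(samples), 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1 - frac)  # learning rate decays to zero
            sigma = sigma0 * (1 - frac) + 0.01  # neighbourhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def hit_map(weights, samples, rows, cols):
    """Count how many samples are mapped to each node."""
    hits = [[0] * cols for _ in range(rows)]
    for x in samples:
        r, c = best_matching_unit(weights, x)
        hits[r][c] += 1
    return hits


# Two toy groups of samples (illustrative values only).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
weights = train_som(samples, rows=3, cols=3)
hits = hit_map(weights, samples, rows=3, cols=3)
```

In practice the input features would normally be standardized before training, since fire frequency and perimeter-to-area ratio live on very different scales; that preprocessing is omitted here for brevity.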

First, the cells covering the world are classified into different pyromes, based on historical fire characteristics and observed patterns within their locations. The observed fires are characterized by six variables: the average frequency [fires/year] (0.30–1175.74), size [km2] (0.28–511.61), perimeter [km] (2.25–102.79), duration [days] (1.34–12.81), expansion [km2/day] (0.23–18.23), and ratio between the perimeter and the area [km−1] (0.78–8.45). Ultimately, 15 clusters are determined, defining general fire pyromes (Fig. 3) distributed across the globe (Fig. 4). These pyromes significantly differ in the mean characteristics of their fires and in their spatial distribution (see Supplementary Tables 1–15 for details per pyrome and regime).

Fig. 3: Pyromes’ hierarchy.

A dendrogram summarizes the hierarchy of the determined 15 pyromes and the similarities between them. As observed, the pyromes can be collapsed into six macrogroups sharing fire characteristics, consistent with our statistical results (see Supplementary Methods), where, for example, the pyrome of observations representing extreme and rare events (R10) is clearly independent of other clusters.

Fig. 4: Fire pyromes.

Cells covering the gridded world are classified into the six macrogroups determined in the dendrogram (different colors), as composed by the determined pyromes based on historical averages of fire characteristics, including the average annual burning frequency, size, perimeter, duration, daily expansion, and perimeter-to-area ratio values. The white and light brown backgrounds represent sections of the world where no fire events were registered for the studied period.

These pyromes can be further condensed into six relevant macro-groups sharing clear fire characteristics (Fig. 3): very large, fast-spreading, and frequent wildfires (R10); large and frequent fires (R11, R2, and R4); medium-sized, slow-spreading, and infrequent fires (R7 and R14); small, medium-to-high frequency, and long-shaped fires (R13, R0, and R5); small and infrequent fires (R9, R1, and R8); and small/medium and very frequent wildfires (R12, R3, and R6).

Further analysis of the different pyromes based on their fire sizes, frequencies, and expansion rates reveals that the R10 pyrome, mainly distributed across northern Australia and southern parts of Africa, is defined by the largest and fastest-expanding fires, with an average fire size of 511 km2 and an average expansion of 18 km2/day. Next in the ranking of fire size, but with medium-to-high occurrence frequencies (average of 172 fires per year), are the R11, R2, and R4 pyromes, with mean fire sizes of 107, 34, and 24 km2, respectively. These four pyromes are often spatially associated with each other and with the region covered by the R10 pyrome. The R11 pyrome regions usually surround the most fire-affected R10 areas. In some cases, the surrounding R11 cluster is accompanied by milder R4 and R2 pyromes, even though the latter pyromes generally occur in the Central African region, Brazil, and Kazakhstan. This spatial pattern of the observed fire activity matches the gradient of the environmental conditions, a common process for several ecological phenomena49.

Pyromes R7 and R14 have average fire sizes of 34 and 9 km2, respectively, but have significantly lower frequencies than the previously mentioned pyromes. The occurrences of R7 and R14 show similar patterns, being mainly distributed across the boreal forests of America and Asia. A heterogeneous macro-group was created from the R13, R0, and R5 pyromes, consisting of small-to-medium fires with sizes between 2 and 5 km2, with medium frequencies. For R0 and R8, there are 31 and nine fires per year, respectively. In the case of R13, there are 307 fires per year on average. These pyromes are distributed across most of Europe, Asia, and America, in zones warmer than those where R7 and R14 are distributed.

The R12, R6, and R3 pyromes comprise a more spatially compact category, defined by fires with small-to-medium sizes (2.94 to 4.7 km2) and very high frequencies (598.62 fires per year on average), occurring almost exclusively in the tropical areas of Africa and in South Asia. Finally, pyromes R8, R1, and R9, with average fire sizes of <2 km2 and low frequencies (lower than two fires per year for R1 and R9, and 9.44 fires per year in the case of R8), are widely distributed across the world; they are particularly abundant in both cold and dry vegetated regions, as well as in wet evergreen tropical forests. From the above-mentioned analysis, it can be seen that the characteristics of the fires and their distributions confirm that there are relationships between fire activity and vegetation, climate, and socio-economic factors. However, the distribution of the areas corresponding to the majority of the fire pyromes is rather dispersed (Fig. 4), and no single combination of factors seems to explain the occurrences of individual contemporary regimes.

When comparing the distribution of our pyromes with those delineated in previous studies, e.g., Archibald et al.15, we found similarities even though we use a different set of variables to define fire characteristics. When comparing Archibald’s five pyromes with our six macro-groups (see Fig. 4), we could clearly identify similarities: (i) in the allocation and characteristics between their “FIL” pyrome and our macro-group composed of R2, R4, and R11, plus the stand-alone R10 pyrome; (ii) their “ICS” pyrome behaves similarly to our R0, R5, and R13 macro-group; (iii) their “FCS” pyrome could be compared with our R3, R6, and R12 macro-group; (iv) their “RIL” pyrome can be related to our macro-group defined by the R7 and R14 pyromes; and (v) their “RCS” pyrome matches our R1, R8, and R9 macro-group. Although those similarities between fire characteristics and their allocation could be expected for a subset of them, some relevant discrepancies could also be observed. The most important is that, in many areas where we defined pyromes with lower fire activity (e.g., R1), Archibald’s study did not identify any pyrome.

Next, we explore the spatial distribution of the pyromes across the globe, in an attempt to define contemporary fire regimes. For this, we determine the most relevant (largest and densest) disjoint regions belonging to each pyrome. Fire regimes are determined by a spatial density analysis (see Methods), indicating the regions of the world where the observations belonging to each pyrome are mainly distributed. We denote the regimes by a combination of the pyrome alias with letters (a, b, c, d, and e) associated with their areas (in decreasing order), i.e., R1-a denotes the largest regime of pyrome R1, and R2-c denotes the third-largest regime belonging to R2 (see Figs. 5, 6). Interestingly, although the contemporary regimes within a pyrome share historical fire patterns, important differences can be observed in those regimes in terms of location, climate, and socio-economic variables (Fig. 5), and in the proportion of dominant land cover50. Here, we include socio-economic descriptors for the per capita gross domestic product (GDP) [USD], population density [individuals/km2], accessibility [minutes], and climatic conditions (historical temperature [°C], precipitation [m], water deficit [mm], and Palmer drought severity index (PDSI)51) (see Supplementary Table 16 for details). Similarly, different seasonal patterns and trends of fire-related variables (such as frequency and average size) are observed based on time series analyses within the areas of different regimes, as represented for the five largest regimes related to pyrome R1 in Fig. 6.
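The naming convention for regimes (pyrome alias plus a letter assigned in decreasing order of area) can be illustrated with a short Python sketch. Note that it swaps in a simpler technique: contiguous groups of cells are found by connected-component labeling on the grid, as a stand-in for the kernel density and contour-level analysis used in the study. The function name, the 8-neighbour connectivity, and the use of cell counts as a proxy for area are assumptions made for this example.

```python
import string
from collections import deque


def label_regimes(cells, pyrome, max_regimes=5):
    """Group the grid cells of one pyrome into disjoint regions and name them.

    `cells` is a set of (row, col) grid indices. Regions are ranked by number
    of cells (an assumed proxy for area) and named '<pyrome>-a', '<pyrome>-b', ...
    Longitude wrap-around at the date line is ignored in this sketch.
    """
    remaining = set(cells)
    components = []
    while remaining:
        start = remaining.pop()
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search over the 8 neighbouring cells
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbour = (r + dr, c + dc)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        component.add(neighbour)
                        queue.append(neighbour)
        components.append(component)
    components.sort(key=len, reverse=True)
    return {
        f"{pyrome}-{string.ascii_lowercase[i]}": comp
        for i, comp in enumerate(components[:max_regimes])
    }


# Toy cell set with three disjoint regions (illustrative indices only).
cells = {(0, 0), (0, 1), (1, 1), (5, 5), (5, 6), (10, 10)}
regimes = label_regimes(cells, "R1")
```

With the toy input, the three-cell region becomes `R1-a`, the two-cell region `R1-b`, and the isolated cell `R1-c`. A density-based delimitation, as in the study, would additionally discard sparse, isolated cells rather than promote them to regimes.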

Fig. 5: Contemporary regimes and driving factors.

The demographic and climatic variables of the most relevant regimes per pyrome are summarized into two independent components using a principal component analysis (PCA) to visualize the regimes' driving factors (a.i, b.i, c.i, d.i, and e.i panels). A denser population, lower GDP, and harder accessibility characterize regimes located in higher values on the demographic component (y axis). Regimes located in higher values on the climatic component experience higher average temperatures, higher precipitation levels, and lower PDSI values (x axis). A shaded region connects regimes within the same pyrome, highlighting the dispersion of the regimes' potential driving factors within each pyrome. The spatial location of the regimes (a.ii, b.ii, c.ii, d.ii, and e.ii panels, matching the colors of the regimes in the PCA plots) and the proportion of the dominant land covers per pyrome (a.iii, b.iii, c.iii, d.iii, and e.iii panels) are included for reference, providing an overall comprehensive characterization of the most relevant contemporary regimes (62 in total) for each pyrome. Detailed results per regime including their dominant land-covers can be found in Supplementary Tables 1–15.
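How a group of variables can be summarized into a single component is sketched below in plain Python. The example standardizes the columns, builds their correlation matrix, and extracts the first principal component by power iteration. It reflects one reading of the caption, namely that each axis is the leading component of its own variable group (climatic or demographic); that reading, the function names, and the toy values are assumptions, and the study's actual PCA procedure is the one described in its Methods.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_component(rows, iters=200):
    """Return (scores, loadings, explained_share) of the first principal component."""
    z = standardize(rows)
    n, d = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)] for i in range(d)]
    # Power iteration; the non-uniform start vector makes it unlikely (not
    # impossible) to begin orthogonal to the leading eigenvector.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(d)) for i in range(d))
    scores = [sum(r[j] * v[j] for j in range(d)) for r in z]
    return scores, v, eigenvalue / d  # the trace of a correlation matrix is d


# Toy matrix: one row per regime, two hypothetical climatic variables.
climate = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
scores, loadings, share = first_component(climate)
```

For the toy matrix the two variables have a correlation of 0.8, so the first component carries 90% of the standardized variance. Applying the same function to a demographic matrix would give the second coordinate of each regime in a two-dimensional map of the kind shown in the figure.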

Fig. 6: Characterizing contemporary fire regimes.

The five largest spatial regimes are determined and represented with different colors for the R1 pyrome after a kernel and contour level analysis. Despite belonging to the same pyrome, dense observation areas are spread in regions with different climatic and demographic characteristics. In this pyrome, regimes cover parts of the western coast of Canada and Alaska (dark green), the Amazonian forest of Peru and Brazil (blue), the North American Great Lakes area (green), the eastern extreme of Russia (orange), and the central Asia region (gray). Notable differences can be observed between these regions in (a) location and land cover distribution; (b) socio-demographic and climatic attributes; and (c–e) seasonal fire characteristics per regime, including (c) monthly average fire size with 95% confidence intervals shown as shades, (d) yearly fire frequency, and (e) monthly average fire frequency. The land covers observed in the pyrome include savannas (WDS, SAV), grasslands (GRS), conifers (ENC), evergreen broadleaf palmate (EBP), shrublands (OSL), mixed forest (MFS), water bodies (WBS), non-vegetated (NV), croplands (CRO), and permanent wetlands (PWL).

When evaluating the relationships between the locations of contemporary fire regimes within the identified pyromes and potential driving factors (i.e., climate and socio-demographic conditions), different patterns and trends can be observed. However, owing to their marked variability, there is no conclusive evidence identifying the main factors defining each pyrome. Therefore, we analyze the regimes within each pyrome. By applying an independent principal component analysis (PCA, see Methods) to the climatic and socio-demographic features characterizing each pyrome and regime, we can project them into a two-dimensional map and thereby identify relevant patterns. By plotting and analyzing their demographic and climatic components (Fig. 5), we observe the following. First, regimes associated with low fire activity pyromes (R1, R8, and R9) show similar dispersion and trends for their demographic and climatic components. These regimes are commonly located in dry, cold, and sparsely populated areas, with the exception of the regimes associated with the R8 pyrome, which often appear in densely populated regions of the world and under less extreme weather conditions. In all cases, no common land use is found to define the fire activity. No single pattern is observed: the land covers vary from a predominance of evergreen forest in warmer areas to grasslands and sparse vegetation in colder areas, while in milder and more populated areas a more fragmented landscape can be assumed52. Second, regimes experiencing large fires (within pyromes R2, R4, R10, and R11) are often located in balanced areas where no predominance of any of the demographic or climatic components can be observed, although regimes within the R4 pyrome, i.e., the pyrome with the mildest fire activity of the group, are concentrated in colder regions. The most common land uses within the areas of these regimes are grasslands, savannas, and shrublands.
Third, regimes belonging to pyromes where large but slow-spreading fires are present (R4 and R7) are always framed by cold, dry, and non-populated areas, and are covered mainly by savannas and shrublands. In this case, the regimes and the associated pyromes share very similar climatic and socio-economic characteristics. Fourth, regimes where small but very frequent fires are characteristic (pyromes R3, R6, and R12) are located in densely populated areas (likely increasing the probability of an ignition) with warm and moist conditions, with below-average incomes, and where the dominance of savannas and shrublands, both highly flammable land covers, is common. However, a singularity can be observed in South Asia, where the observed fire conditions are related to more diverse landscapes in which evergreen forests, savannas, and croplands coexist. Fifth, regimes associated with small-to-medium fires that are far less frequent than the previous types tend to be present under a wider range of conditions, as observed in their associated pyromes (R0, R5, and R13). These regimes can occur in areas dominated by grasslands, croplands, and savannas, but also in large areas covered by mixed or evergreen forests, leading to a higher level of landscape fragmentation. In general, we note that low population densities are common in regimes located in dry and cold regions. In contrast, the density of the human population increases sharply when these regimes are located in warmer and moister regions.

Comparing our results in terms of underlying factors to previous studies can be misleading. The comparison of the analysis of regime areas grouped by pyromes (Fig. 5), even if delimited based on spatial density, can provide a basis for comparison, being the most similar approach to the one applied by the studies of Archibald et al. and Chuvieco et al.7,15. When comparing the factors influencing the occurrence of our contemporary fire regimes, we can observe similarities in the prevalence of grasslands and shrublands between our corresponding pyromes and the “FIL” and “FCS” pyromes reported in Archibald et al.15. In these areas, although we observe higher relevance of the climatic component in both studies, in our case, the regimes within the R2, R4, R11, and R10 pyromes are distributed across average values of the demographic component, as opposed to Archibald’s. Similarly, their research reports a variety of biomes within their “ICS” pyrome, a characteristic that can also be observed for our R0, R5, and R13 macro-group. Regarding other influencing factors, such as climate and human-related factors, the comparison becomes more challenging since we explored the combination of variables through principal components in our analysis of grouped fire regimes. Still, we observe that the results of Archibald et al., in general, lead to a much higher level of overlap and dispersion of influencing factors between pyromes than our approach, which translates into less homogeneous and compact (i.e., more difficult to interpret) results than those of our study.

Although an analysis of the combined driving factors based on regimes provides a deeper understanding of the variability of the conditions associated with these regimes, it still captures only a portion of the full picture required to understand both historical and contemporary fire regimes. To fully accomplish this goal, the best approach is to independently observe each of the regimes and their potential influencing factors. As an example, we explore the contemporary regimes within the R1 pyrome (Fig. 6), where the differences between regimes are especially clear between the Amazonian regime, denoted in blue, and the two regimes located in the Northern Hemisphere, denoted by different shades of green. In this example, we found that the largest regime (R1-a), located in the western part of North America, shows low fire activity (1.63 fires per year on average), as driven by the cold temperatures and sparse vegetation on the inland parts of the area (and likely by intense suppression in the coastal zones). The Amazonian hotspot (R1-b) is influenced by very high moisture (PDSI of 75.39, 1.23 m of average yearly precipitation) owing to rainfall and closed vegetation coverage. The hotspot located in the area of large lakes (R1-c) is characterized by a heterogeneous landscape of mixed forests (16.8%) and croplands (35.9%), which, together with suppression policies, may explain its limited fire activity. With regard to the two regimes located in Asia, the regime in the eastern part of Siberia (R1-e, orange) is characterized by cold weather (an average of −8.88 °C throughout the year), a low population density (0.056 individuals per km2), and sparse vegetation. Meanwhile, the area in central Asia (R1-d) is clearly defined by a lack of water (average water deficit of 415.79 mm) and the absence of large plants (86% of the land is covered by a combination of grasslands and non-vegetated areas).
Considering the omission of seasonality from the metrics used to define the pyromes, differences in this regard can be observed a posteriori among the regions. Whereas some subgroups show a clear tendency to concentrate the number of fires and areas burned in the Northern Hemisphere summer, the Amazonian regime shows a lesser tendency to concentrate in a specific season, with peak fire activity between December and January, but with fires still occurring during the rest of the year. The eastern Siberian regime also shows a different temporal pattern, with most of the fires in spring, but with those of larger size occurring during summer (see Supplementary Tables 1–15 for a comprehensive description of all pyromes and contemporary regimes).
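The seasonal comparison between regimes can be made concrete with a small Python sketch that derives the monthly average fire frequency of a regime, its peak month, and a simple concentration measure. The input format (one `(year, month)` pair per fire event), the function names, and the "share of fires in the three busiest months" index are assumptions introduced for illustration; they are not the metrics reported in the study.

```python
from collections import Counter


def monthly_average_frequency(event_months, n_years):
    """Average number of fires per calendar month (January first).

    `event_months` is an iterable of (year, month) pairs, one per fire event.
    """
    counts = Counter(month for _, month in event_months)
    return [counts.get(m, 0) / n_years for m in range(1, 13)]


def peak_month(monthly):
    """Calendar month (1-12) with the highest average fire frequency."""
    return max(range(12), key=lambda i: monthly[i]) + 1


def seasonal_concentration(monthly):
    """Share of the annual fire count that falls in the three busiest months."""
    total = sum(monthly)
    if total == 0:
        return 0.0
    return sum(sorted(monthly, reverse=True)[:3]) / total


# Toy events for one regime over two years (illustrative values only).
events = [(2001, 7), (2001, 7), (2001, 8), (2002, 7), (2002, 1)]
monthly = monthly_average_frequency(events, n_years=2)
peak = peak_month(monthly)
concentration = seasonal_concentration(monthly)
```

A regime whose fires are spread throughout the year yields a concentration close to 0.25 (three of twelve months), whereas a strongly seasonal regime approaches 1.0, which offers one way to express the contrast between the Amazonian and the boreal regimes described above.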

The results show that different combinations of climatic, vegetation, and human factors may lead to similar fire patterns in different regions of the world. However, changes in these factors may induce shifts in fire activity in nearby locations53. Nevertheless, it is possible to determine clear gradients of fire activity if one of the influencing factors, especially climate or vegetation, changes accordingly. This spatial gradient is clear in regimes with little fire activity, and is limited by the low temperatures and scarce vegetation in the northernmost boreal areas. As an example, we observe some regimes associated with pyrome R9 shifting into regimes characterized by the fire activity of R1, as well as fire-related conditions becoming more conducive to regimes belonging to the R7 or R14 pyromes. Similarly, regions with larger and fast-expanding fires (e.g., those characteristic of R10) are surrounded by pyromes sustaining smaller but still large fires (R11, surrounded by regimes within R4) if the climate and vegetation gradually become less hazardous. It is also possible to identify shifts in fire activity in nearby areas caused by non-natural processes. For example, in the limits of the Amazonian forest (R4-c), continuous coverage of the rain forest has been disrupted13,54,55, leading to frequent fires of considerable size. Other natural processes can be observed across Asia, where slight differences in climate (and subsequently in vegetation), such as increased rainfall (R4-a, in Kazakhstan) or higher temperatures (R0-a), modify the recurrences and sizes of fires, following evidence from existing literature (see Supplemental Information, Table 16 for details). Despite the increased fire activity owing to deforestation processes, an arid environment such as western Australia (R4-e) is also able to sustain recurrent fires after stabilization of the allochthonous vegetation56.
Upon analyzing the driving factors impacting fire patterns, we have observed that, in general, they align with the expected effects described in relevant literature (e.g., refer to Supplemental Information, Table 16). It is important to note, however, that no pyrome or contemporary regime can be solely attributed to the impact of a particular driving factor in isolation. This observation underscores the inherent complexity in analyzing and understanding fire regimes. Therefore, it can be stated that there is a clear but complex relationship between the distribution of the driving factors and fire activity. Studies based on the impacts of changing factors on fire activity provide an important source of information regarding the past and future changes of fires and their impacts18,57,58, but only an understanding of the interactions between the factors in an area corresponding to a fire regime and comparisons between regimes will provide the whole picture, i.e., to understand why a regime is present in a certain region and not in another. Moreover, a similar approach should be used to predict possible shifts in regimes that, as has been shown, can hardly be interpreted based on the change of a unique factor.

The proposed methodology can spatially frame a fire regime (contemporary or historical, depending on available data) in a specific area, with a high level of homogeneity in fire activity, vegetation characteristics, and underlying vegetation factors. Constraints owing to the different interpretations of fire regimes28 have restrained this delimitation process on a global scale. This point is crucial, as it is recognized that understanding fire regimes is key to understanding present and future fire activity and its impacts on the Earth. The most limiting constraint for delimiting fire regimes (historical or contemporary) at the global scale is the coupling of the fire activity and the associated vegetation. Contrary to a vast number of previous investigations, this study did not assume that a single vegetation type must be associated with a fire regime, as fire controls are heterogeneous across landscapes, and the spatial configurations of vegetation and fuels define the characteristics of a fire regime on a landscape59,60. On a global scale, assuming that a mosaic of vegetation groups does not influence or comprise part of a fire regime seems to be a limiting premise. In this context, we believe that a global mosaic of vegetation types and other land covers, even if coarse, should also be considered as a part of what defines a regime, and not as a constraint. Moreover, if fire regimes are to be studied, they need to be framed spatially, and neither the results from defining equal fire pyromes nor the use of single vegetation groups can solve one of the most essential requirements for setting a fire regime: defining the specific region where the fire characteristics are repeated over time5,28.

Conclusions

Understanding and characterizing wildfire regimes is crucial knowledge for a better understanding of both the temporal and spatial impacts of fire on the evolution of ecosystems61, and on the several services that are provided by them62. This comprehensive understanding remains an open question, subject to future research.

We present an open-source framework and classification system that helps to uncover contemporary (i.e., based on the most recent 19 years of data) fire regimes and their most common regions in a systematic way, without assuming that the geopolitical borders or climatic characteristics of vegetation biomes are a constraint when framing their influencing areas. It can reveal similarities and dissimilarities between fire pyromes and their underlying factors, incorporating novel aspects into the comprehensive analysis such as socio-demographic variables. This method aims to support specific studies and adaptation measures across the globe, planned according to the relative importance of the driving factors and their expected changes. This type of assessment, as shown in this study, requires a two-step clustering process. The first step was based on the fire characteristics alone, and the second step focused on the spatial distribution of those fire characteristics. Without splitting a fire pyrome into spatially framed regimes, understanding the underlying factors that cause such specific fire characteristics becomes difficult, and may produce inconclusive or even misleading results. Maintaining, restoring, or mitigating fire activity depends on the understanding of the current natural fire regimes, and their changes under future conditions. This knowledge cannot be based on global generalizations, and should be based on site-specific recommendations. Site-specific considerations other than climate should be embedded into the models. Similarly, any global study related to fire activity or its future change should integrate the combination and interrelation of all influencing factors, often regionally specific, without assuming global oversimplifications on the impact of a single factor or its changes.
Yet, delimiting fire regimes in the space, studying them individually, and comparing distant regimes with similar fire activity and/or adjacent fire regimes with differences in fire activity may be the key to truly understanding the complexity of fire regimes.

Our vision conflicts with some fire regime conceptual approaches based on the direct crossing of fire syndromes with biomes or vegetation types, as a way to study fire regimes, as it does not reflect a combination of fire characteristics on a specific vegetation type. However, it solves the issue of defining well-delimited regions with a high level of homogeneity on all fire-associated characteristics and driving factors, which other approaches do not. An additional issue that may arise from our study is the inclusion of all types of fires. Agricultural fires, often associated with those fires occurring in croplands, tend to be considered a source of noise when studying global fire characteristics. This is due to the limitations of MODIS satellite data in capturing small fires63 or to their specific relation with human production culture, which is distinct from that of other wildfires. Still, we decided to include them in the study based on the importance of croplands in terms of global land coverage and the predominance of fires on croplands across many regions of the world63,64. Another reason is the historic importance of human behavior, including the use of agricultural fires, to understand the evolution of fire regimes39,65,66, and hence, the contemporary fire regimes delimited in the study. Finally, as previously mentioned, our approach does not confine a regime to specific vegetation types, but defines areas where different land uses and vegetation types coexist to define a landscape. Croplands, therefore, should not be excluded from the landscape, because they are a fundamental part of it. Moreover, agricultural fires are not always controlled and may escape to neighboring areas covered by other vegetation types39, and even if croplands are assumed to increase landscape fragmentation and reduce fire propagation due to their lower fuel loads67, they can still propagate wildfires depending on the seasonality and weather conditions66.
Therefore, the results presented in this study need to be interpreted within this context. For example, the inclusion of cropland fires can lead to overestimation of fire occurrence and biased representation of fire characteristics (e.g., ignition sources, timing, and duration). Conversely, excluding these fires can result in underestimation of overall fire frequency and incomplete understanding of fire severity. This bias can affect the accuracy and reliability of the findings (e.g., not even identifying the area as part of a relevant pyrome or contemporary regime) and may limit their applicability to other ecosystems. In addition, the high omission error associated with these fires can make it challenging to accurately assess the frequency and intensity of fires in these areas. Future directions may involve, e.g., using different methods to estimate fire occurrence and fire characteristics in cropland areas, or adjusting the pyromes to account for the lower representation of these fires, among other approaches.

The temporal extension of our current harmonized dataset should be considered sufficient to capture fire characteristics and delineate both pyromes and the proposed contemporary fire regimes. By expanding the dataset's temporal dimension in future iterations, it will be possible to identify changes in fire activity and underlying drivers (see Methods, Data description section for details about potential dataset additions). Moreover, if data sources other than remote sensing are considered39, it could be possible to expand our knowledge on past fire activity and reconstruct the impact of large-scale global changes on fire regimes39,65.

Methods

Data description

Two global datasets containing observations of individual wildfires in vector format, both derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) MCD64A1 collection 668 with an underlying resolution of 500 m, were used in this investigation. This collection, although a well-known and commonly used product, remains limited. It suffers from a negative bias when estimating the total area burned at a global scale and systematically underestimates smaller burned areas, which limits both the dataset and our study. This limitation is characterized by a high proportion of commission and omission errors (40.2% and 72.6%, respectively) compared to a Landsat 8 Operational Land Imager image at a resolution of 30 m69. The lowest errors (24% and 27%, respectively) are observed in the boreal forest biome characterized by large, persistent, and well-defined wildfires, whereas the highest error levels are observed in the tropical forest, temperate forest, and Mediterranean biomes (>50% and >90%, respectively). However, both errors are found to be significantly compensated for when evaluating the accuracy of the product at coarser resolutions69 (e.g., on the order of km), as in the one used in our study.

The Global Fire Atlas19 from NASA (https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1642) provided us with ~13.3 million individual wildfires between 2003 and 2016. This dataset contains information regarding the timing and location of ignitions, fire sizes, durations, daily expansions, fire line lengths, average speeds, and dominant directions during the spread. Individual wildfire samples between 2000 and 2018 were obtained from the GlobFire44 Database (https://doi.pangaea.de/10.1594/PANGAEA.895835). This dataset provides similar attributes to the Global Fire Atlas set (ignition, duration, size, daily burnt areas, etc.), and the same features can be obtained via simple calculations.

Although these two databases represent the best available datasets depicting individual fires and their characteristics, they have limitations. A known limitation of both datasets is the over-fragmentation of wildfire events44, where unique wildfires are split into multiple fires, owing to the way these datasets are generated. In the case of the Global Fire Atlas, it applies a fire event identification algorithm at a tile level (obtained from MODIS data). This approach leads to an artificial splitting of the fires following the pattern of the grid tiles generated by MODIS when a fire spreads over multiple tiles. The GlobFire Database attempts to correct this limitation by introducing an algorithm that runs independently of the grid tile and source, allowing it to avoid splitting unique wildfires covering more than one tile from MODIS. Despite this improvement, it still suffers from this known limitation. Therefore, the results presented in this study are affected by this limitation, which biases fire characteristics such as the average frequency and average wildfire sizes (depending on the intensity of this effect in different sections of the world). In practice, we partially mitigated this limitation by consolidating both datasets. We deleted duplicated wildfires and identified as many fragmented fires as possible by comparing overlapping events between both datasets. Then, we selected the entries with the minimum number of wildfires, to minimize the impact of this limitation in the analysis. Finally, we generated a global raster summarizing the yearly observations (2000-2018) and statistics for multiple resolutions (0.05°–1°), focusing the study on the 1° product. We tested different resolutions in this interval, in addition to 1.25° and 1.5°.
We found that 1° optimized the trade-off between accuracy, quality, and stability of the results and total computational time required to process and perform spatial operations with the global dataset. We did not see major differences in the outputs (i.e., pyromes and fire regimes predicted and their characteristics) when comparing the 0.05°–1° datasets. However, decreasing the resolution (i.e., using 1.25° or 1.5°) did impact the quality of the results, leading to non-robust, non-convergent, and method-dependent outputs.

We also note that, although MODIS-derived products are widely used and considered accurate for global analysis70, incorporating other burned-area databases, such as FireCCILT11, can enhance the current study by extending fire activity records back to 1982. Recent efforts to improve AVHRR-derived products71 have addressed some of their shortcomings, making them valuable for studying fire activity in the period between 1982 and 2000, thus expanding the study period by 18 years. Therefore, their inclusion would allow for the investigation and detection of potential shifts in fire activity (e.g., to study the dynamics of the fire regimes/pyromes) that are currently out of the scope of this study, leading to new research questions. However, given the scope of our study, which mainly focuses on providing a contemporary delineation of fire regimes and not on their temporal evolution, and given the qualities of the dataset described above, we chose to base our analysis primarily on the MODIS-derived products. Along these lines, we also note that combining fire products could result in problems/biases, such as differences in spatial resolution and accuracy issues related to AVHRR-derived products, which may lead to over-representation of fires72. By avoiding mixing these products, we aim to provide a more accurate and focused analysis of contemporary fire regimes, leaving the integration of multiple products and expansion of temporal analysis for future research.

Each wildfire and its attributes were projected onto a grid according to the selected resolution, and we calculated the average frequency [fires/year], size [km2], perimeter [km], duration [days], expansion [km2/day], and the ratio between the perimeter and the area [km−1] at the cellular level. A latitudinal correction was automatically applied (to all layers) to account for high-latitude area perturbations by identifying the corresponding Coordinate reference system (CRS) projection. In addition, non-flammable areas including the poles and large concentrations of water (e.g., oceans, lakes) were filtered from the grid, to avoid bias in the analyses and comparisons between regions.
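The cell-level aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record keys (`lat`, `lon`, `size_km2`, `perimeter_km`, `duration_days`) and the function names are hypothetical, the expansion is approximated as size divided by duration, and no latitudinal correction or masking of non-flammable areas is applied.

```python
import math
from collections import defaultdict

def cell_index(lat, lon, res=1.0):
    """Map a coordinate to the (row, col) of a regular lat/lon grid cell."""
    return (math.floor((lat + 90.0) / res), math.floor((lon + 180.0) / res))

def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def summarize_cells(fires, n_years, res=1.0):
    """Aggregate individual fires into per-cell averages.

    `fires` is an iterable of dicts with hypothetical keys:
    lat, lon, size_km2, perimeter_km, duration_days.
    """
    groups = defaultdict(list)
    for fire in fires:
        groups[cell_index(fire["lat"], fire["lon"], res)].append(fire)

    summary = {}
    for cell, items in groups.items():
        summary[cell] = {
            "frequency": len(items) / n_years,                      # fires/year
            "size": _mean(f["size_km2"] for f in items),            # km2
            "perimeter": _mean(f["perimeter_km"] for f in items),   # km
            "duration": _mean(f["duration_days"] for f in items),   # days
            # Assumption: expansion approximated as size / duration (km2/day).
            "expansion": _mean(f["size_km2"] / f["duration_days"] for f in items),
            # Ratio between perimeter and area (km^-1).
            "perimeter_area_ratio": _mean(f["perimeter_km"] / f["size_km2"] for f in items),
        }
    return summary
```

Each cell then carries the six fire features that are later normalized and clustered.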

One fire characteristic used in related research and not considered in our study to define pyromes is the fire radiative power (FRP), often used as a proxy of fire intensity15,73,74. Although there is an expected correlation between FRP and variables such as the size and expansion of fires73,74, its inclusion in our framework may further refine the classification of pyromes and expand the results and conclusions of our research.

The climatic, land cover, and demographic layers were obtained from global reanalysis projects providing gridded monthly and yearly data between 2000 and 2018. The average, maximum, and minimum 2 m temperature, and accumulated precipitation at a resolution of 0.25° produced by the European Centre for Medium-Range Weather Forecasts ERA5 reanalysis (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels) were summarized as yearly data. We used the TerraClimate dataset (http://www.climatologylab.org/terraclimate.html) to extract the PDSI and climate water deficit, with a resolution of 2.5 arc-min. Annual land cover data with a resolution of 500 m was obtained from the MODIS/Terra + Aqua Land Cover Type Yearly L3 Global 500-m grid (MCD12Q1) Version 6 data product (https://lpdaac.usgs.gov/products/mcd12q1v006/), which provides annual global land cover types. The world was classified into 17 different land covers including: water bodies, evergreen needleleaf forests, evergreen broadleaf forests, deciduous needleleaf forests, deciduous broadleaf forests, mixed forests, closed shrublands, open shrublands, woody savannas, savannas, grasslands, permanent wetlands, croplands, urban and built-up lands, cropland/natural vegetation mosaics, non-vegetated lands, and sparsely vegetated areas.

The estimated population densities (number of people per square kilometer) for the years 2000, 2005, 2010, 2015, and 2020 (used as a proxy for 2018) were extracted from the Gridded Population of the World Version 4 (GPWv4), Revision 11 dataset (https://sedac.ciesin.columbia.edu/data/collection/gpw-v4) at a resolution of 30 arc-s grid cells, and were summarized for the period under study. The accessibility to cities, measured as the land-based travel time (minutes) to the nearest densely populated area with 1500 or more inhabitants per square kilometer, was obtained from the Malaria Atlas Project (https://malariaatlas.org/research-project/accessibility_to_cities/) with a resolution of 30 arc-s for 2015. In addition, we used the gridded global datasets of the gross domestic product (GDP) and “Human Development Index” between 2000 and 2015 (https://datadryad.org/stash/dataset/doi:10.5061/dryad.dk1j0), provided as a NetCDF file at a 5 arc-min resolution, to extract and calculate the zonal statistics of the average GDP per capita in a given administrative area for each regime determined in our research. For 2016–2018, we used 2015 data, simple extrapolations, and data projections75, obtaining similar results. All features extracted from the different datasets were consolidated into global GeoTIFF files with a common resolution, and the perturbations of high-latitude regions were corrected using a geographic information system.

Clustering analysis

We apply a series of statistical clustering methods to unveil relevant and interpretable pyromes from our consolidated fire dataset. We test multiple preprocessing and dimensionality reduction techniques to identify meaningful clusters, not depending on a particular method or data transformation. For this, a numerical database derived from the global wildfire data between 2000 and 2018 was generated. We only included cells within the land (no water/ocean). The features (columns) corresponded to the yearly fire-related variables. Clusters were defined using the normalized yearly average values of the entire data of the frequency (number of fires per year), size (area in square kilometers covered by the fire), perimeter (km), duration (days until the fire was suppressed), daily expansion (area evolution per day), and ratio perimeter/area of the observations, classifying each pixel of the grid as a part of a unique fire regime.

Multiple clustering and unsupervised machine learning algorithms were tested and compared, including density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS), K-Means, self-organizing maps, and dimensionality reduction techniques (i.e., a principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE)). Quantitative and qualitative comparisons were performed using various performance metrics, such as the intra/inter distance between groups, silhouette value76, and elbow method (comparing the sum of the squared distances from each point to its assigned center, see Supporting Information Table 17 for details). The final number of groups was obtained by performing a statistical analysis of all possible classifications, i.e., by comparing multiple subsets of features, numbers of clusters, algorithms, and expert assessments. Similar results were obtained across the different clustering algorithms even when modifying the preprocessing/dimensionality reduction methods applied to the original features before clustering the observations. In all our experiments, we observed a common convergence towards the identified 15 clusters. Furthermore, the identified clusters present a high level of interpretability and provide us with an effective balance between computational complexity and robustness of the results. Selecting fewer clusters led to more variability and oscillation in the outputs/statistical analysis, potentially leading to less accurate representations of the data. Increasing their number led to overly granular solutions, losing the global scope/view of the study.
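As an illustration of one of the metrics mentioned above, the mean silhouette value of a labeled set of observations could be computed as in the following sketch. It assumes Euclidean distances and at least two clusters; the function name and the convention of scoring singleton clusters as 0 are choices made for this example and are not taken from the study.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette value over all points (Euclidean distance, >= 2 clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values close to 1 indicate compact, well-separated groups, whereas negative values suggest that observations sit closer to a neighboring cluster than to their own.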

The PCA77 algorithm was implemented to quickly visualize the multidimensional data into a two-dimensional map by selecting independent components that retained the maximum percentage of the dataset variance. In our study, we implemented it as a preprocessing step before applying the clustering algorithms, which was useful for testing the robustness of the classification, and for evaluating and visualizing the incidence of the climatic and demographic components of each pyrome and fire regime. For this step, we performed an independent PCA for each set of attributes, that is, climatic and demographic, by selecting a unique component to summarize each dimension. By applying this method, we obtained unique components explaining 51% and 59% of the variance of the features, respectively. These components provided us with potential explanations regarding the driving factors behind each fire regime, as characterized by the following weights: βPDSI = −0.135, βAVGTemp = 0.694, βTotalPrecipitation = 0.707, βGDP = −0.590, βPopD = 0.556, and βAccess = −0.584.
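To make the role of the reported weights concrete, the sketch below projects a small table of features onto a single component using the loadings listed above. It assumes that each feature is standardized (z-scored) before projection, which the text does not state explicitly; the column names and function names are hypothetical.

```python
import math

# Loadings of the single retained component per attribute set, as reported in the text.
CLIMATE_LOADINGS = {"pdsi": -0.135, "avg_temp": 0.694, "total_precip": 0.707}
DEMOGRAPHIC_LOADINGS = {"gdp": -0.590, "pop_density": 0.556, "access": -0.584}

def standardize(values):
    """Return z-scores (zero mean, unit population standard deviation)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def component_scores(table, loadings):
    """Project the standardized columns of `table` (dict of lists) onto one component."""
    z = {name: standardize(table[name]) for name in loadings}
    n = len(next(iter(z.values())))
    return [sum(loadings[name] * z[name][i] for name in loadings) for i in range(n)]
```

Under this reading, a cell that is warmer and wetter than average receives a high climatic score, while a cell with high GDP and long travel times to cities receives a low demographic score.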

The t-SNE algorithm78 is a nonlinear dimensionality reduction technique used for the visualization of high-dimensional datasets. It computes a probability distribution for pairs i and j of multidimensional observations, such that similar objects lead to higher probability values (see (1), where σi is the bandwidth of a Gaussian kernel). Then, a second probability distribution is defined over all of the observations in a low-dimensional map, thereby minimizing the Kullback-Leibler divergence79 between the two calculated distributions.

$$\mathbf{P}(j\mid i)=\frac{\exp\left(-\lVert \mathbf{x}_{i}-\mathbf{x}_{j}\rVert^{2}/2\sigma_{i}^{2}\right)}{\sum_{k\ne i}\exp\left(-\lVert \mathbf{x}_{i}-\mathbf{x}_{k}\rVert^{2}/2\sigma_{i}^{2}\right)}$$
(1)
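A direct evaluation of Eq. (1) for one observation i can be sketched as follows. The bandwidth σi is passed in explicitly here as a simplifying assumption; in the standard t-SNE procedure it is tuned per observation to match a target perplexity. The function name is hypothetical.

```python
import math

def conditional_probabilities(points, i, sigma_i):
    """P(j|i) from Eq. (1) for a fixed observation i and bandwidth sigma_i."""
    weights = {}
    for j, x_j in enumerate(points):
        if j == i:
            continue  # the sum in the denominator excludes k = i
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], x_j))
        weights[j] = math.exp(-sq_dist / (2.0 * sigma_i ** 2))
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}
```

Observations close to i receive most of the probability mass, which is what allows the low-dimensional map to preserve local neighborhoods.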

We applied this algorithm to visualize the centroids of the regimes into a two-dimensional map, providing an effective tool for evaluating the similarities/differences between clusters and for interpreting the fire regimes obtained from the multiple clustering algorithms. This analysis was supplemented by dendrograms providing a hierarchical structure of the centroids of the clusters (see Supplementary Methods).

K-means80 was used as the main unsupervised learning algorithm to classify the observations into k clusters. It seeks to minimize the within-cluster variance. We applied it to our multidimensional dataset using different preprocessing techniques (e.g., normalization, standardization, and dimensionality reduction methods). K-means models were fitted for all k ∈ [2, 35]. Performance metrics and all centroid statistics were recorded, and a series of plots were generated to visualize the core characteristics of each fire regime. From the experiments, a convergence in the performance metrics was observed with k ∈ [10, 20], and notable improvements were observed until k = 10. We determined the configurations that best balanced the trade-offs between complexity, interpretability, and reproducibility. Once a cell was classified, a label feature was generated in a GeoTIFF band, capable of remapping the cells into the world grid for visualization purposes and expert assessment. A Gaussian kernel was applied to smooth the results.
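The procedure of fitting K-means for a range of k and recording the within-cluster sum of squares (the quantity inspected in the elbow method) can be sketched with a plain implementation of Lloyd's algorithm. This is an illustrative sketch, not the authors' code: the random initialization from the data points, the iteration cap, and the function name are assumptions.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialize from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    wcss = sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))
    return centroids, labels, wcss
```

A sweep such as `{k: kmeans(cells, k)[2] for k in range(2, 36)}` (with `cells` a hypothetical list of normalized feature tuples) would then give the curve whose flattening indicates a suitable number of clusters.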

Self-organizing maps81 are artificial neural network models applied to the nonlinear dimensionality reduction of high-dimensional data. Following an unsupervised learning approach, the models were trained using competitive learning, where the nodes of the network were activated (competed) to represent certain characteristics of the data. Training samples were fed into the network, and their Euclidean distance was calculated with respect to the weight vectors of the nodes, which were initialized by sampling from the subspace generated by the two largest principal component eigenvectors. The weights of the neuron(s) with the closest distance (best matching units) were updated using (2), where Wu(t) is the weight of neuron u at time step t, θ(v, u, t) is the distance function between neurons v and u at step t, η(t) is the learning rate, and D(i) is the input vector of the i-th training sample.

$${W}_{u}(t+1)={W}_{u}(t)+\eta (t)\theta (v,u,t)(D(i)-{W}_{u}(t))$$
(2)
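A single training step following Eq. (2) can be sketched as below. The Gaussian form chosen for θ(v, u, t), with v taken as the best matching unit, and the fixed `lr` and `radius` values are assumptions made for illustration; the text does not specify them.

```python
import math

def som_update(weights, sample, lr, radius):
    """One competitive-learning step on a 2-D SOM.

    `weights` maps grid positions (row, col) to weight vectors (lists of floats).
    Returns the grid position of the best matching unit (BMU).
    """
    # Best matching unit: node whose weight vector is closest to the sample.
    bmu = min(weights, key=lambda u: math.dist(weights[u], sample))
    for u, w in weights.items():
        # Assumed Gaussian neighbourhood theta(v, u, t) centred on the BMU v.
        grid_sq = (u[0] - bmu[0]) ** 2 + (u[1] - bmu[1]) ** 2
        theta = math.exp(-grid_sq / (2.0 * radius ** 2))
        # Eq. (2): W_u(t+1) = W_u(t) + eta(t) * theta * (D(i) - W_u(t))
        weights[u] = [wi + lr * theta * (di - wi) for wi, di in zip(w, sample)]
    return bmu
```

Repeating this step over all samples, typically while shrinking `lr` and `radius`, pulls neighboring nodes toward similar regions of the feature space.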

This process was repeated for each input vector for a number of training epochs or until convergence within a small δ > 0 threshold was achieved. Once the map was obtained, the components of the U-Matrix were calculated as the average distance between the vectors of the nodes’ weights and their closest neighbors (eight nodes in a rectangular grid), thus representing the high-dimensional data as a two-dimensional matrix. By applying a color gradient (e.g., blue to red), the U-matrix could be used to determine potential clusters of data where cooler sections (compact regions) were considered as clusters, and warmer areas represented the boundaries (larger distances) between these groups.

This representation could help for visualizing the clusters in high-dimensional spaces, or for recognizing them automatically using relatively simple image processing techniques. We processed the resulting U-matrix with a Laplacian of Gaussian (LoG)82 kernel, one of the most widely used and effective blob detection methods, aimed at detecting homogeneous areas within an image that are significantly different (e.g., color and brightness) from their neighborhoods. Belonging to the family of differential methods, and based on derivatives of the function with respect to the position, it consists of a sequential application of a Gaussian kernel (to smooth the image and attenuate its noise) and the Laplacian filter, emphasizing regions with rapid intensity changes. Thus, the LoG operator (3), used to convolve with the U-matrix, is obtained to determine the potential fire regime classifications.

$$LoG(x,y)=-\frac{1}{\pi {\sigma }^{4}}\left[1-\frac{{x}^{2}+{y}^{2}}{2{\sigma }^{2}}\right]{e}^{-\frac{{x}^{2}+{y}^{2}}{2{\sigma }^{2}}}$$
(3)
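The operator in Eq. (3) can be discretized into a small kernel and applied to a matrix such as the U-matrix. The sketch below is a minimal illustration; the kernel size, the zero padding at the borders, and the function names are assumptions rather than details given in the text.

```python
import math

def log_kernel_value(x, y, sigma):
    """Evaluate Eq. (3) at offset (x, y) from the kernel centre."""
    r2 = x * x + y * y
    s2 = sigma * sigma
    return -(1.0 / (math.pi * s2 * s2)) * (1.0 - r2 / (2.0 * s2)) * math.exp(-r2 / (2.0 * s2))

def log_kernel(size, sigma):
    """Build a size x size discrete LoG kernel (size should be odd)."""
    half = size // 2
    return [
        [log_kernel_value(x, y, sigma) for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]

def convolve_at(image, kernel, row, col):
    """Kernel response centred at (row, col); out-of-range pixels count as 0.

    The LoG kernel is symmetric, so correlation and convolution coincide.
    """
    half = len(kernel) // 2
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                total += kernel[dy + half][dx + half] * image[r][c]
    return total
```

The kernel is negative at its centre, crosses zero at x² + y² = 2σ², and is positive beyond that radius, so strong responses mark compact regions that differ from their surroundings at a scale set by σ.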

Spatial and temporal analyses

To spatially delineate pyromes, identify their main hot-spots (regimes) based on the density of their observations, and characterize them with temporal-dependent variables, we applied the following spatial density and temporal analysis. Gaussian kernels using a radius of 5 and bandwidth h for minimizing the mean integrated squared error ((4), which measures the difference between the original function f(x) and its kernel density estimator \({\hat{f}}_{h}(x)\)) were applied for the spatial characterization of the regimes. We determined the regions of the world with the most fire observations, based on the density of the cells in a particular regime. Contour lines were calculated for each local region (regime), accounting for 10%, 30%, 50%, 70%, and 90% of the local observations, respectively, to determine the areas of the world where the fire regimes were focused. The regions with at least 30% of the local observations were then ordered by area (largest to smallest), thereby characterizing the top five or maximum numbers with identifiable areas in terms of demographic, climatic, and soil features. Once the regions were determined, historical time series were generated for each regime for all relevant features, focusing on the inter-annual and monthly aggregated fire frequency and average fire size, so as to determine the regional seasonality and trends.

$$MISE(h)=\mathbf{E}\left[\int \left(\hat{f}_{h}(x)-f(x)\right)^{2}dx\right]$$
(4)
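The density-based delineation can be illustrated with a two-dimensional Gaussian kernel density estimator and a helper that finds the density level enclosing a given share of the observations. In this sketch the bandwidth h is supplied by the caller rather than selected by minimizing Eq. (4), coordinates are treated as planar, and the function names are hypothetical.

```python
import math

def gaussian_kde(points, h):
    """Return a 2-D Gaussian kernel density estimator with bandwidth h."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * h * h)

    def density(x, y):
        return norm * sum(
            math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * h * h))
            for px, py in points
        )

    return density

def contour_level(points, density, fraction):
    """Density level whose contour encloses roughly `fraction` of the observations."""
    values = sorted((density(x, y) for x, y in points), reverse=True)
    count = max(1, math.ceil(fraction * len(values)))
    return values[count - 1]
```

Evaluating `contour_level` for fractions of 0.1, 0.3, 0.5, 0.7, and 0.9 gives the thresholds at which the corresponding contour lines would be drawn for a regime.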