Introduction

Synchronized flowering in less-seasonal forests in Southeast Asia, called general flowering, is one of the most spectacular and mysterious events that occur in tropical ecosystems. At irregular intervals of several years, diverse species, including species in the Dipterocarpaceae family, flower heavily1,2,3,4. Synchronous flowering and fruiting sometimes occur on a wide geographic scale (c. 10–10 km2)5,6,7,8, together with flowering episodes occurring at other times at relatively small spatial scales2,9. A number of proximate cues for general flowering have been proposed, including drought10,11,12,13 associated with the El Niño Southern Oscillation6,11,14, cloud-free conditions, and high solar radiation15,16,17, and a night-time drop in the minimum temperature2,5. In addition, stored nutrients, especially phosphorus, have been implicated as an endogenous factor regulating flowering in tropical rainforests limited by phosphorus9,18,19,20. However, general flowering with intervals longer than 2 years is likely to be caused by external environmental factors rather than endogenous factors because recovery from nutrient shortages after heavy fruiting occurs relatively quickly within 1–2 years19. Recent studies have demonstrated that the synergism between cool temperatures and drought is the major trigger for floral induction in dipterocarp trees19,21,22.

Global climate change brings elevated temperatures and more variable rainfall to Southeast Asia23. Projecting future phenological changes is an urgent task for the management and conservation of Southeast Asian rainforests because general flowering plays an important role in their successful regeneration and restoration24,25. However, the impact of climate change on the reproductive phenology of tropical rainforests is poorly understood26,27,28. This is partly owing to the lack of long-term phenological data and the absence of predictive models that capture the mechanistic relationships between climatic factors and reproductive phenology in the tropics26,27,28.

To assess the past and future tropical phenology in Southeast Asia, we analyzed historical records of reproductive phenology and meteorological data over 35 years in Peninsular Malaysia and predicted what will happen regarding community-wide flowering events in the future.

Results

Proportion of flowering and fruiting species decreased from the mid-1970s to the early 2000s

Our reproductive phenology data were collected from Bulletin Fenologi Biji Benih dan Anak Benih (Bulletin of Seed and Seedling Phenology), which was deposited at the library of the Forest Research Institute Malaysia (FRIM) located ~12 km northwest of Kuala Lumpur, Malaysia. Phenology monitoring was conducted based on monthly observations of the presence of flowers and fruits of tree species growing in the arboretum of FRIM from April 1976 to September 2010. After excluding species that did not satisfy the five required criteria for data accuracy, our phenology data included 210 species from 41 families (Supplementary Data 1). Dipterocarpaceae was the most abundant family (45% of total species), followed by Malvaceae (7.6%) and Leguminosae (6.2%) (Fig. 1a). These long-term phenology data from >200 species exposed to the same environment at the arboretum provide an excellent opportunity to compare phenological responses to climate change among species.

The fractions of flowering and fruiting species fluctuated heavily between years. The fraction of fruiting species was highly correlated with the fraction of flowering species with a time lag of 2 months after flowering (the time-lagged cross-correlation = 0.77). The greatest number of flowering and fruiting events occurred in 1985, in which >35% of monitored species participated in flowering and fruiting (Fig. 1b, c). Large flowering events with the flowering of >20% of monitored species occurred six times over 35 studied years, and these flowering events were followed by mass fruiting events (Fig. 1b, c, Supplementary Fig. 1a). These six large reproductive events at FRIM were synchronized with general flowering events monitored in natural forests in Peninsular Malaysia8,9,23,24, suggesting that the flowering and fruiting patterns between the arboretum at FRIM and natural forests were similar. The levels of between-species synchrony in flowering and fruiting events were significantly higher in Dipterocarpaceae species than in non-Dipterocarpaceae species (P < 0.0001, two-way analysis of variance, n = 4465–6555; Supplementary Fig. 1b). Moreover, the coefficients of variation in the proportion of flowering and fruiting species in Dipterocarpaceae were twice as large as those in non-Dipterocarpaceae species (1.787 and 1.583 for Dipterocarpaceae and 0.803 and 0.753 for non-Dipterocarpaceae, respectively). These results indicate that Dipterocarpaceae has a pivotal role in general flowering.

We detected decreasing trends in the proportions of flowering (P = 0.0021, Mann–Kendall (MK) test, two-sided, n = 400; Supplementary Table 1) and fruiting species (P < 0.0001, MK test, two-sided, n = 400; Supplementary Table 1) from the mid-1970s to the early 2000s. In contrast, temperature and precipitation during this period revealed an increasing trend at mean rates of 0.39 ± 0.02 °C (P < 0.0001, MK test, two-sided, n = 10,330; Fig. 1d) and 0.51 ± 0.26 mm/day per decade (P = 0.027, MK test, two-sided, n = 12,668; Fig. 1e), respectively. The monthly frequency of flowering and fruiting varied largely among species (Fig. 2a, b); 17% of species flowered at least once per year, whereas 25% of species flowered only once every 10 years (Supplementary Data 2). Regardless of this large variation in the frequency of reproductive events across species, most species exhibited clear reproductive seasonality. At the community level, two flowering peaks occurred in April and October (Fig. 2a), followed by fruiting peaks in June and December, respectively (Fig. 2b). The seed dispersal time, which was calculated as the month when all mature fruits dropped from their mother tree, peaked in February or August, two months after the fruiting peaks (Fig. 2c). The timing of the two seed dispersal peaks matched the phases in which temperatures and rainfall started to increase (Fig. 2d, e), which was consistent with the finding that seed dispersal is timed to match the favorable humidity condition for the survival of offspring29. Among the nine families containing at least five species, only Moraceae, which includes the genus Ficus, produced flowers and fruits almost year-round without any seasonality (Fig. 3). Other families show synchronized flowering phenology in both spring and autumn (e.g., Dipterocarpaceae; Fig. 3) or spring-flowering dominance (e.g., Anacardiaceae, Lauraceae, and Meliaceae; Fig. 3).

Drought and cool temperature signals can explain general flowering in Dipterocarpaceae

The decreased proportions of flowering and fruiting species observed in the past can be explained by increased temperatures or decreased drought event frequencies because cool temperatures and drought have been suggested to be major environmental drivers of general flowering19,21,22. To investigate the relationships between flowering and temperature and between flowering and precipitation, we adopted a model that was developed to predict flowering phenology in Dipterocarpaceae21 and was further extended to predict the flowering phenology of tropical plants on Barro Colorado Island, Panama30. The previous model was successful in explaining the weekly flowering phenology of five dipterocarp species over 10 years in the Pasoh forest reserve in Peninsular Malaysia; thus, it is a reliable model for forecasting future flowering phenology in Dipterocarpaceae. The model assumed that potential environmental cues to floral induction, the cool unit (CU) and drought unit (DU), accumulate for n1 days prior to the onset of floral induction, and flowers then develop for n2 days before opening (Supplementary Fig. 2). The DU model evaluates whether drought alone cues flowering (drought-induced flowering). The CU × DU model evaluates whether cold and drought have a multiplicative effect on flowering (low temperature- and drought-induced flowering). Logistic regression was performed using only DU and using CU × DU as the explanatory variables and using the presence or absence of a first flowering event as the dependent variable. We examined these two models because dipterocarp flowering has been predicted successfully by both of these models in previous studies21,29.

To perform model fitting, we focused only on Dipterocarpaceae because of the low percentage of missing values (4.81%) compared with those of non-Dipterocarpaceae species (16.8% on average; Supplementary Data 1). Because some species have similar flowering phenology, we first performed time-series clustering using 95 dipterocarp species based on the similarity of their flowering phenology (Fig. 4a) and then carried out the forward selection of the optimal number of phenological clusters based on the minimization of the Akaike information criterion (AIC) (Supplementary Fig. 3). We used data from May 1976 to March 1996 to train the model and data from July 1997 to April 2005 to validate the model. We chose these periods for the model training and validation because there was a blank period in the data set from April 1996 to June 1997 in which climate data were missing.

The model-fitting results revealed that the optimal number of phenological clusters was 10 (seven clusters and three independent species; Supplementary Data 3). After removing independent species and clusters with fewer than five species (due to their small sample sizes), six clusters remained (Fig. 4a; Supplementary Data 4). The CU × DU model was selected to explain the flowering phenology of clusters 3 and 4, two major clusters including 27 and 28 species, respectively (Fig. 4b; Supplementary Table 2). The flowering phenology of other clusters was explained by drought only (Fig. 4b; Supplementary Table 2). The area under the ROC (receiver operating characteristic) curve (AUC) values ranged from 0.64–0.78 for the training data and 0.62–0.79 for the validation data (Supplementary Table 2), suggesting an acceptable discrimination ability of the model for most clusters31. Because cluster 1 showed low AUC values (0.64 for the model training and 0.62 for the model validation), predictions of this cluster must be approached with caution (Supplementary Table 2).

Projections of 21st-century changes in flowering phenology

We predicted the future flowering phenology based on the model for each of six phenological clusters under two climate scenarios, Representative Concentration Pathways (RCP) 2.6 with limited CO2 emissions and RCP8.5 with high CO2 emissions, which were simulated using three general circulation models (GCMs; GFDL–ESM2M; Fig. 5a, b, IPSL–CM5A-LR and MIROC5; Supplementary Fig. 4). Compared to the 1976–2005 period, the minimum temperature was predicted to increase by 1.2 ± 1.1 °C under the RCP2.6 scenario and by 3.1 ± 1.7 °C under the RCP8.5 scenario by 2099. Under the RCP2.6 scenario, the mean predicted flowering probabilities during the 2050–2099 period across the three models decreased in clusters 3 and 4 to 57% and 49% of the predicted flowering probabilities during the 1976–1996 period, respectively (Fig. 5c). Under the RCP8.5 scenario, the mean predicted flowering probabilities in clusters 3 and 4 were further reduced to 37% and 28% during the 2050–2099 period, respectively (Fig. 5c). The decreased flowering probabilities in these two phenological clusters were caused by lower occurrences of low-temperature flowering cues in the future (Supplementary Fig. 5). Under the RCP8.5 scenario, low-temperature signals that trigger flowering in the species included in phenological clusters 3 and 4 rarely occurred or completely disappeared (Supplementary Fig. 5). In contrast, in species included in other clusters that were sensitive only to drought cues for flowering, the mean flowering probabilities were unchanged between the 1976–1996 and 2050–2099 periods (Fig. 5c) because the drought-related flowering cues were rather stable throughout the simulations (Supplementary Fig. 5).

To confirm our predictions, we extended our phenology forecasting to three other regions in Southeast Asia, Trang Province in Thailand32, Lambir Hills National Park in the island of Borneo12, and central Kalimantan in Indonesia13, where long-term phenology monitoring plots exist (Fig. 6a). Decreased flowering probabilities were predicted only in clusters 3 and 4 in all regions, while the flowering probabilities of other species were predicted to be robust (Supplementary Fig. 6), confirming the predictions obtained in FRIM. A comparison of seasonal flowering patterns along a latitudinal gradient based on historical climate data simulated during the 1976–1996 period from GCMs revealed shifts from a unimodal flowering peak in spring (March in the northern hemisphere) in Trang Province, bimodal, or weak flowering peaks in spring and fall in FRIM and Lambir National Park and a pronounced flowering peak only in spring (September in the southern hemisphere) in central Kalimantan (Fig. 6b). The predicted seasonal flowering patterns are consistent with the previous observations12,13,32, showing that using low temperatures and drought to forecast phenology can be applicable to wide regions in Southeast Asia. The latitudinal gradient of seasonal flowering patterns was predicted to be robust to climate change in the 21st century (Fig. 6c), suggesting that the different seasonal distribution of flowering probability among the four regions can be explained by the differential seasonal rainfall patterns across the regions (Supplementary Fig. 7). These results suggest that the phenological responses of tropical trees in Southeast Asia to climate change are not qualitative but are rather quantitative.

Discussion

Phenological shifts are among the most widely studied biotic responses to climate change33. Most phenological shift observations come from temperate and boreal biomes where advancing biological springs and delayed biological winter arrivals are documented in response to increased temperatures34,35. Tropical species have been suggested to be more sensitive to climatic changes than temperate and boreal species because they have evolved in areas with less seasonal environmental variation36,37. However, no studies have addressed the vulnerability of tropical species under future climate change. Our projections of 21st-century changes in the flowering phenology of tropical plant species revealed that a 1.2 °C increase in temperature under the RCP2.6 scenario resulted in an ~50% decrease in the future flowering probabilities of 57% of dipterocarp species that are sensitive to low-temperature flowering cues. In a temperate perennial herb that requires winter cold for floral initiation, a significantly decreased flowering probability was predicted when the temperature increased by 4.5 °C38. These results suggest that tropical species might be more sensitive and vulnerable to climate change than species located in temperate ecosystems.

Our results also highlight the variable features of phenological changes among species in response to climate change. Forty-three percent of dipterocarp species are predicted to be sensitive only to drought for flowering, and their reproductive activities are robust to climate change. Different phenological responses across species would alter forest regeneration and, eventually, the plant species composition in the future. Regardless of the significant effect of climate change on the flowering probability at the quantitative level, the seasonal distribution of flowering probability was found to be robust in wide regions of Southeast Asia. This result represents another interesting difference between tropical and temperate plant species.

Although the data presented here are one of the longest records of reproductive phenology in tropical ecosystems39, we still need to be cautious when interpreting these results because there is room to extend our analyses from the phenological cluster level to the species level when longer-term and higher-temporal-resolution data become available. With upgraded phenological data, the estimation accuracy of the environmental drivers of tropical phenology and predictive power will be improved. Because observations of reproductive phenology in tropical plants are still rare due to a paucity of long-term studies26,27,28, continued phenology monitoring is necessary39.

There have been only a few reports on past phenological changes in tropical plants. In Kibale National Park, Uganda, fruiting phenology from two data sets (covering 1970–1983 and 1990–2002) revealed that the proportion of individual fruiting was negatively correlated with the minimum temperature and that increased rainfall was associated with the complete absence of fruiting in common tree species40. In Ranomafana National Park, Madagascar, 12 years of fruiting phenology observations showed a correlation between increased rainfall and an increase in fruiting in some tree species41. These studies highlight the complex and variable features of phenological changes among tropical plant species in response to climate change37. An improved mechanistic understanding of the environmental drivers of reproductive phenology in diverse species in different tropical ecosystems will unravel the complex nature of phenological responses in the tropics and will allow the extension of future reproductive phenology projections from regional to global scales42.

The rapid global warming that occurred over the last millennium was unprecedented43. Our results suggest that plastic responses to climate change at the individual level may not be high for the tropical species studied herein. Moreover, species with long generation times are the least likely to be rescued by evolution alone44. Early detections of biotic change signatures and predictions of the magnitude and direction of changes in plant reproductive phenology will benefit management programs and aid in the sustainable future of the most diverse ecosystems on Earth.

Methods

Data collection of flowering and fruiting phenology

Monthly reproductive phenology data recorded over 35 years (from April 1976 to September 2010) were collected from the Bulletin Fenologi Biji Benih dan Anak Benih (Bulletin of Seed and Seedling Phenology), which was deposited at the FRIM library. The bulletin reported seed and seedling availabilities and the flowering and fruiting phenology of trees at several research stations in Malaysia. The present study collected flowering and fruiting records of trees grown in FRIM arboretums located approximately 12 km northwest of Kuala Lumpur, Malaysia (latitude 3°24 ‘N, longitude 101°63 ‘E, elevation 80 m). There are both dipterocarp and non-dipterocarp arboretums in FRIM, both of which were founded in 1929. These arboretums preserve and maintain living trees for research and other purposes. Each month, three research staff members of FRIM with sufficient phenology monitoring training made observations with binoculars to record the presence of flowers and fruits on trees of each species on the forest floor from April 1976 to September 2010. The phenological status of the trees was recorded as flowering during the developmental stages from flower budding to blooming and as fruiting during the developmental stages from the occurrence of immature fruit to fruit ripening. Because only one or two individuals per species are grown at the FRIM arboretums, the flowering and fruiting phenology were monitored using these individuals. The resultant flowering and fruiting phenology data included a time series of binary data (1 for presence and 0 for absence) with a length of 417 months.

The original data included 112 dipterocarp and 240 non-dipterocarp species. We excluded 17 dipterocarps and 125 non-dipterocarp species based on the following five criteria for data accuracy.

1. 1.

Percentage of missing values is ≤50%: If the monthly flowering or fruiting phenology data of a species included a substantially large number of missing values (>50%), the species was excluded.

2. 2.

Stable flowering period: We considered an observation to be unreliable if the flowering period was significantly different among flowering events (if the coefficient of variation in the flowering period was larger or equal to 1.0).

3. 3.

Flowering period is shorter than or equal to 12 months: we considered an observation to be unreliable if the flowering period was longer than 12 months because it was unlikely that the same tree would flower continuously for longer than 1 year.

4. 4.

The flowering and fruiting frequencies were not significantly different between the first and second half of the census period: when the flowering frequency was zero for the first half of the observation period but was larger than 0.1 for the second half of the observation period, or when the flowering frequency was zero for the second half of the observation period but was larger than 0.1 for the first half of the observation period, we removed these species because data are not reliable (e.g., physiological conditions may have changed significantly). We adopted the same criteria for the fruiting phenology data.

5. 5.

We removed overlapping species, herb species, and specimens with unknown species names.

After removing unreliable species based on the five criteria explained above, we obtained 95 dipterocarp and 115 non-dipterocarp species (Supplementary Data 1). We used these species for further analyses. It is unlikely that our final data includes trees that were replaced by young trees during the census period because newly planted seedlings do not flower over 20–30 years until they are fully grown to the reproductive stage (>20–30 cm DBH)45.

Detection of seasonality in reproductive phenology

To compare the flowering and fruiting phenology seasonality among different families, nine families that included at least five species were used. The number of flowering or fruiting events was counted for each month from January to December during a census, and then the frequency distribution was drawn as a histogram. Similarly, we also generated a histogram for the seed dispersal month, which was calculated as the month when fruiting ended (i.e., when the binary fruiting phenology data changed from one to zero).

Classification of phenological patterns

To classify the phenological patterns, we performed time-series clustering using the R package TSclust46 with the hierarchical clustering method based on the Dynamic Time Warping distance of the flowering phenology data of each species. For this analysis, time points at which there were missing values for at least one species were excluded. Because of the large number of missing values in non-Dipterocarpaceae species, we performed time-series clustering only for the Dipterocarpaceae species based on 394 time points in total. The number of phenological clusters was estimated based on AIC, as explained below.

Climate data

Daily minimum, mean, and maximum temperatures and precipitation data monitored at the FRIM KEPONG (3° 14’ N, 101° 42’ E, elevation 97 m) weather station were provided by the Malaysian Meteorological Department. We used the daily minimum temperature for our analysis because there were fewer missing values compared to the numbers of missing daily mean and daily maximum temperature values. The periods in which climate data were available were from 1 March 1973 to 31 March 1996, and from 23 July 1997 to 20 April 2005. We removed periods in which there were missing values spanning longer than 5 days. When the range of missing values spanned a period shorter than 3 days, we approximated these missing values using the mean minimum temperatures recorded on the adjacent three days. Although solar radiation data were not available for our study, the use of precipitation is sufficient for model fitting because there is a significant negative correlation between solar radiation and precipitation in Southeast Asia47.

Climate data generated by GCMs

As the future climate inputs, we used bias-corrected climate input data from 1 January 2050 to 31 December 2099, with a daily temporal resolution and a 0.5° spatial resolution, provided by the ISI-MIP project48; these data are based on the Coupled Model Intercomparison Project Phase 5 outputs from three GCMs: GFDL–ESM2M, IPSL–CM5A-LR, and MIROC5. To compare the flowering phenology between 1976–1996 and 2050–2099, bias-corrected GCM data from 1 May 1976 to 31 March 1996, were also used. This period (1 May 1976–31 March 1996) is consistent with the period used for model fitting. We selected daily minimum temperature and precipitation time series from the 0.5° grid cells corresponding to the study site for phenology monitoring at FRIM. To compare flowering phenology among regions, we also used the same set of data from three other regions in Southeast Asia: Trang Province in Thailand (7° 4’ N, 99° 47’ E), Lambir Hills National Park in Malaysia (4° 2’ N, 113° 50’ E), and central Kalimantan in Indonesia (0° 06’ S, 114° 0’ E). Because the study site in FRIM was not in the center of a 0.5° grid cell, we interpolated the data using four grid cells in the vicinity of the observation site. We used the weighted average according to the distance between each observation site and the center of each corresponding grid cell.

Although the climate input data provided by ISI-MIP were already bias-corrected, we conducted additional bias correction at FRIM using a historical scenario for each GCM data set and the observed weather data from 1 January 1976 to 31 December 2004 based on previously presented protocol49. We did not implement any bias correction for the frequency of dry days or precipitation intensity of wet days49 because we only focused on the average precipitation.

The variances in the annual fluctuation of the monthly mean precipitation were not the same between the observation data and historical GCM runs at FRIM. For all three GCMs (GFDL–ESM2M, IPSL–CM5A-LR, and MIROC5), the variances in the yearly fluctuation output by the GCMs tended to be larger than that of the observed data at the FRIM KEPONG weather station during winter and spring. On the other hand, during summer and fall, the variances output by the GCMs tended to be smaller than that of the observed data. These biases could not be corrected using the previous method49. Therefore, we conducted the following bias correction for these data:

$${p}_{i,m,y}^{{{{{{\rm{GCM}}}}}}* }={r}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}\cdot \left[{F}_{\Gamma }^{-1}\left({F}_{\Gamma }\left({\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}|{k}_{m,y},{\theta }_{m,y}\right)|{k}_{m,y}^{* },{\theta }_{m,y}^{* }\right)\cdot {\rho }_{m,y}^{{{{{{\rm{GCM}}}}}}}\right],$$
(1)

where $${p}_{i,m,y}^{{{{{{\rm{GCM}}}}}}* }$$ is the bias-corrected precipitation value of the target GCM at year y, month m, and date i. In the equation, $${r}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}$$ is the ratio of the precipitation value of the GCM relative to the monthly mean value. Then, the following equation is used:

$${r}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}=\frac{{p}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}}{{\bar{p}}_{m,y}^{{{{{{\rm{GCM}}}}}}}},$$
(2)

where $${p}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}$$ is the precipitation value (not bias-corrected) of the GCM at year $$y$$, month $$m$$, and date i and $${\bar{p}}_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ is the monthly mean precipitation value of the GCM at year $$y$$ and month $$m$$. In Eq. 1, $${F}_{\Gamma }$$ represents the cumulative distribution function of a gamma distribution, $${F}_{\Gamma }^{-1}$$ represents the inverse function of the cumulative distribution function of the gamma distribution, and $${k}_{m,y}$$ and $${\theta }_{m,y}$$ are the shape parameters. In Eq. 1, $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ indicates the deviation of the monthly mean from the normal climate value of the corresponding period, and this value is calculated as follows:

$${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}=\frac{{\bar{p}}_{m,y}^{{{{{{\rm{GCM}}}}}}}}{{\rho }_{m,y}^{{{{{{\rm{GCM}}}}}}}},$$
(3)

where $${\rho }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ is the normal climate value during the target period. In this method, we defined the normal climate value as the mean of the monthly mean precipitation values over 31 years.

$${\rho }_{m,y}^{{{{{{\rm{GCM}}}}}}}=\frac{1}{31}\mathop{\sum }\limits_{j=y-15}^{y+15}{\bar{p}}_{m,j}^{{{{{{\rm{GCM}}}}}}}.$$
(4)

When the mean of a gamma distribution is fixed at one, the shape parameters are represented as follows:

$${k}_{m,y}=\frac{1}{V\left({\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}\right)},$$
(5)
$${\theta }_{m,y}=\frac{1}{{k}_{m,y}},$$
(6)

where $$V\left({\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}\right)$$ indicates the variance in $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ at month $$m$$ over 31 years.

In this method, we assumed that the $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ value follows a gamma distribution and that the ratio of the variance of $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ to the variance of $${\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}$$ is maintained even in the future scenario. Here, $${\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}$$ represents the deviation of the monthly mean in the observation data from the normal climate value.

$${\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}=\frac{{\bar{p}}_{m,y}^{{{{{{\rm{obs}}}}}}}}{{\rho }_{m}^{{{{{{\rm{obs}}}}}}}},$$
(7)
$${\rho }_{m}^{{{{{{\rm{obs}}}}}}}=\frac{1}{28}\mathop{\sum }\limits_{j=1976}^{2004}{\bar{p}}_{m,y}^{{{{{{\rm{obs}}}}}}}.$$
(8)

In the above equations, $${\bar{p}}_{m,y}^{{{{{{\rm{obs}}}}}}}$$ indicates the monthly mean precipitation value in the observed data. As mentioned above, because we assume that the ratio of the variance in $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ to the variance in $${\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}$$ is maintained, $${k}_{m,y}^{* }$$ and $${\theta }_{m,y}^{* }$$ are calculated as follows:

$${k}_{m,y}^{* }=\frac{{k}_{m,y}}{\alpha },$$
(9)
$${\theta }_{m,y}^{* }=\frac{1}{{k}_{m,y}^{* }},$$
(10)

where

$$\alpha =\frac{V\left({\delta }_{m,y}^{{{{{{{\rm{GCM}}}}}}}^{{{{{{\rm{h}}}}}}}}\right)}{V\left({\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}\right)}.$$
(11)

In Eq. 11, $${\delta }_{m,y}^{{{{{{{\rm{GCM}}}}}}}^{{{{{{\rm{h}}}}}}}}$$ is the deviation of the monthly mean of the historical GCM precipitation data from the normal climate value. Here, we defined the normal climate value as the average monthly mean during 1976–2004.

The method proposed here is an original bias correction method, but the above equations are easily derived if we assume that the $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ value follows a gamma distribution and that the ratio of the variance in $${\delta }_{m,y}^{{{{{{\rm{GCM}}}}}}}$$ to the variance in $${\delta }_{m,y}^{{{{{{\rm{obs}}}}}}}$$ is maintained even in the future scenario. Notably, because we combined this method with the bias correction method described previously49, Eq. 2 should be expressed as follows:

$${r}_{i,m,y}^{{{{{{\rm{GCM}}}}}}}=\frac{{\widetilde{p}}_{l,m,y}^{{{{{{\rm{GCM}}}}}}}}{{\bar{p}}_{m,y}^{{{{{{\rm{GCM}}}}}}}},$$
(12)

where $${\widetilde{p}}_{l,m,y}^{{{{{{\rm{GCM}}}}}}}$$ is the precipitation data that are bias-corrected using the method described previously49. Bias-corrected data were compared with the data without bias correction (Supplementary Figs. 811).

Statistical analyses and reproducibility

We adopted previously presented models in which environmental triggers for floral induction accumulate for n1 days prior to the onset of floral induction21 (Supplementary Fig. 2). Flowers then develop for n2 days before opening (Supplementary Fig. 2). The model assumption of the time lag between floral induction and anthesis, which is denoted as n2, was validated by a previous finding in which the expression peaks of flowering-time genes, which are used as molecular markers of floral induction, were shown to occur at least one month before anthesis in Shorea curtisii19. S. curtissi is included in our data set. The CU at time t, $${{{{{\rm{CU}}}}}}\left(t|{\theta }^{C}\right)$$, is calculated as follows:

$${{{{{\rm{CU}}}}}}\left(t|{\theta }^{C}\right)=\mathop{\sum }\limits_{n={n}_{2}}^{{n}_{2}+{n}_{1}-1}{{{{{\rm{max }}}}}}\{\bar{C}-x\left(t-n\right),0\},$$
(13)

where $${\theta }^{C}=\left\{{n}_{1},{n}_{2},\bar{C}\right\}$$ is the set of parameters and x(t) is the temperature at time t. Here, $$\bar{C}$$ indicates the threshold temperature. The term max{x1, x2} is a function that returns a larger value for the two arguments. Similarly, given $${\theta }^{D}=\{{n}_{1},{n}_{2},\bar{D}\},$$ the DU at time t, $${{{{{\rm{DU}}}}}}\left(t|{\theta }^{D}\right)$$, is defined as the difference between the mean daily accumulation of rainfall over n1 days and a threshold rainfall level ($$\bar{D}$$):

$${{{{{\rm{DU}}}}}}\left(t|{\theta }^{D}\right)={{{{{\rm{max }}}}}}\left\{\bar{D}-\mathop{\sum }\limits_{n={n}_{2}}^{{n}_{2}+{n}_{1}-1}y\left(t-n\right)/{n}_{1},0\right\},$$
(14)

where y(t) is the rainfall value at time t. The term max{x1, x2} is defined similarly as in Eq. 13.

Logistic regression was performed using only the DU and using the product of CU and DU (CU × DU) as the explanatory variables and using the presence or absence of a first flowering event as the dependent variable for each phenological cluster. Because the number of phenological clusters is unknown, we performed forward selection on the cluster number based on the AIC. Let m be the number of phenological clusters based on the dendrogram drawn from the time-series clustering explained above (Supplementary Fig. 5). Given m phenological clusters, let $${G}_{k}^{m}$$ be the kth set of clusters in which the DU model is adopted for model fitting. Here, $${G}_{k}^{m}$$ indicates the set of cluster IDs, and k ranges from 0 to m(m+1)/2. For example, when m = 2 (i.e., there are two clusters, clusters 1 and 2), there are four cluster sets, calculated as follows:

$${G}_{0}^{m=2}=\{\},{G}_{1}^{m=2}=\{1\},{G}_{2}^{m=2}=\{2\},{G}_{3}^{m=2}=\{1,2\},$$
(15)

where the element in the bracket indicates the ID of the cluster in which the DU model is adopted for model fitting. When k = 0, the DU model is not used; instead, the CU × DU model is adopted for model fitting for both clusters 1 and 2. Let i be the ith element of the vector E, which is defined as follows:

$${{{{{\bf{E}}}}}}=\{{t}_{1}^{1},\,{t}_{2}^{1},...\,,\,{t}_{n}^{1}\,,...\,,\,{t}_{1}^{m},\,{t}_{2}^{m},...,\,{t}_{n}^{m}\},$$
(16)

where n is the length of the time-series data for each cluster. Notably, n = 223 is the same for all species and clusters. The term $${t}_{1}^{m}$$ in the above equation denotes the first time point of the time series of length n for the species included in cluster m. Given m and k, let $${p}^{(m,k)}(i)$$ be the flowering probability of element i of vector E. The term $${p}^{(m,k)}(i)$$ is expressed as follows:

$${{\log }}\left[\frac{{p}^{\left(m,k\right)}\left(i\right)}{1-{p}^{\left(m,k\right)}\left(i\right)}\right]= \mathop{\sum }\limits_{j=1}^{m}{\alpha }_{m,j}\cdot {Z}_{m,j}\left(i\right)+\mathop{\sum }\limits_{j\in {G}_{k}^{\left(m\right)}}^{m}{\beta }_{m,j}\cdot {Z}_{m,j}\left(i\right)\cdot {{{{{{\rm{DU}}}}}}}_{m,j}\left(i|{\theta }_{j}^{D}\right)\\ +\mathop{\sum }\limits_{j\notin {G}_{k}^{\left(m\right)}}^{m}{\beta }_{m,j}\cdot {Z}_{m,j}\left(i\right)\cdot {{{{{\rm{CU}}}}}}\left(i|{\theta }_{j}^{C}\right)\times {{{{{{\rm{DU}}}}}}}_{m,j}\left(i|{\theta }_{j}^{D}\right),$$
(17)

where $${Z}_{m,j}(i)$$ is the dummy variable indicating a cluster for i; $${Z}_{m,j}(i)$$ equals 1 if the ith element of E belongs to the jth cluster, otherwise it is zero, and $${\alpha }_{m,j}$$ and $${\beta }_{m,j}$$ in Eq. (5) are regression coefficients for the jth cluster when the species are grouped into m clusters. We estimate the parameters and the number of clusters based on a finite number of observations. Given the number of clusters m, for each of m clusters, the parameters were estimated by maximizing the loglikelihood value calculated for all combinations of potential parameter values for $${n}_{1},{n}_{2},\bar{C},$$ and $$\bar{D}$$ within the ranges of [1 (min), 50 (max)] for n1, [1,50] for n2, [19,25] for $$\bar{C}$$, and [1,9] for $$\bar{D}$$. We varied the days (n1 and n2) by integers, temperature ($$\bar{C}$$) by tenths of a degree C, and daily precipitation ($$\bar{D}$$) by tenths of a mm. Regression coefficients ($${\alpha }_{m,j}$$, $${\beta }_{m,j}$$) for all j values under a given m value and associated likelihoods were determined using generalized linear models with binomial error structures.

With the results of the parameter estimations, we determined the number of clusters in two steps. For the first step, for a given m, we obtained $$\hat{k}(m)$$ according to the following equation:

$$\hat{k}(m)={\arg }\mathop{{\min }}\limits_{k}\{{{{{{\rm{AIC}}}}}}\{m,k(m)\},\,k(m)\,=\,0,\,...,{2}^{m}\}.$$
(18)

For the second step, with the results of $$\hat{k}$$ obtained from the first step, we obtained the estimate of the number of clusters according to forward selection by searching for the $$\hat{m}$$ value that satisfies the following inequalities:

$${{{{{\rm{AIC}}}}}}(\hat{m},\,\hat{k}(\hat{m}))\, < \,{{{{{\rm{AIC}}}}}}(\hat{m}+1,\,\hat{k}(\hat{m}+1))\cap {{{{{\rm{AIC}}}}}}(\hat{m},\,\hat{k}(\hat{m}))\, < \,{{{{{\rm{AIC}}}}}}(\hat{m}-1,\,\hat{k}(\hat{m}-1)).$$
(19)

For model fitting, the first flowering month was extracted from the flowering phenology data. When flowering lasted more than 1 month, the month after the first flowering month was replaced by a value of zero (absence of flowering). If the month before the first flowering month was a missing value, the first flowering month was treated as a missing value and was not used for further analyses. We assumed that phenology monitoring was performed on the first date of each month.

Projections of 21st-century changes in flowering phenology

We used two scenarios (RCP2.6 and RCP8.5) to forecast future reproductive phenology in dipterocarp species for each of the three GCMs (GFDL–ESM2M, IPSL–CM5A-LR, and MIROC5). We predicted the flowering probability per month for each phenological cluster during the periods from 1 May 1976–31 March 1996 and from 1 January 2050–31 December 2099 based on the best model (Supplementary Table 2). The predicted flowering probability during the 2050–2099 period was normalized to that during the 1976–1996 period for each climate scenario and for each of three GCMs. To compare the seasonal patterns between 1976–1996 and 2050–2099, the predicted flowering probability was averaged for each month from January to December and plotted for each month in Fig. 6. R version 3.6.3 was used for all analyses.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.