Seasonal climate forecast can inform the European agricultural sector well in advance of harvesting

Seasonal climate forecasts are a key component of sectoral climate services. Skill and reliability in predicting agro-climate indicators, co-designed with and for European wheat farmers, are here assessed. The main findings show how seasonal climate forecast provides useful information for decision-making processes in the European winter wheat-producing sector. Flowering time can be reliably predicted already at the beginning of the growing season in central and eastern Europe, thus supporting effective variety selection and timely planning of agro-management practices. The predictability of climate events relevant for winter wheat production is strongly dependent on the forecast initialization time as well as the nature of the event being predicted. Overall, regionally skillful and reliable predictions of drought events during the sensitive periods of wheat flowering and grain filling can be made already at the end of winter. On the contrary, predicting excessive wetness seems to be very challenging as no or very limited skill is estimated during the entire wheat growing season. Other approaches, e.g., linked to the use of large-scale atmospheric patterns, should be identified to enhance the predictability of those harmful events.


INTRODUCTION
Agricultural production has been increasingly exposed to unfavorable climate events and extremes in the last decades 1,2 . Examples of such extremes are heat stress, drought, and excessive rainfall 3 . These events can lead to heavy reductions in, and even failures of, crop yield quantity and quality 4 , with potential regional-to-global consequences in the agricultural markets and trade patterns 5 . Climate change is projected to further exacerbate this tendency.
The resilience of the agricultural sector can be enhanced by developing and implementing dedicated sectoral climate services to reduce risks but also seize the opportunities given by tailored climate information. Seasonal climate predictions, with their lead time up to 1 year ahead, offer a great opportunity to inform and support farmers in their agro-management actions, e.g. on: planning of sowing, selection of optimal crop variety, planning of fertilization and field interventions, disease treatment, and irrigation water use 6 . Seasonal climate predictions can also provide valuable information to decision makers in relation to market, trade patterns, development policies, and humanitarian assistance linked to food availability and security 7 .
Seasonal climate predictions have already been shown to bring added value for agricultural decision making in several regions of the world [8][9][10][11] . Despite the potential benefits of seasonal forecasts for farmers 12 , these forecasts face many challenges in adequately meeting end-users' expectations, associated with lower prediction skill in several key regions (such as Europe) and with the dependency between the skillful forecast time and the spatial scale of relevant climate events 13 . The interpretation of probabilistic forecast uncertainty is also an important, yet still challenging, aspect to be considered for an effective integration into sectoral decision-making processes 14 .
The assessment of seasonal prediction skill and reliability is a necessary step in the development of informative agro-climate services. Usually, as a first step, the assessment is done for the essential climate variables, being crucial for the understanding and the limitations of the climate system predictability 15,16 .
However, the assessment of the added value for decision making should be based on an integrated perspective, considering what will be made available to end-users, for instance in the form of a set of indicators addressing climate risks along the growing season. Here, we explore the skill and the reliability of the European Centre for Medium-Range Weather Forecasts (ECMWF) seasonal forecasting system SEAS5 16 in predicting agro-climate indicators relevant for the European winter wheat-producing system. We use a recently proposed agro-climate service Clisagri 6 , developed in a co-designed approach with wheat farmers and agronomists. We provide a spatial assessment of seasonal forecast predictions at different stages of the winter wheat growing season, discuss the predictability of different types of climate events relevant for wheat farmers, and finally point to future opportunities linked to the increased lead time of skilful predictions.

Wheat phenology prediction
The main results point to skill in flowering prediction in central and eastern Europe and Turkey already in the November runs, when an entire growing season is still ahead in most of Europe (Fig. 1). Lack of skill can be observed in most of western Europe. Significant positive correlation between the ensemble median of predicted flowering dates and observed flowering dates (i.e., simulated dates based on observed climate) characterizes eastern Europe and Turkey. The reliability of flowering prediction indicates usefulness for decision making, especially for the prediction of early and late flowering in central and eastern Europe. In contrast, the prediction of the normal category lacks reliability in almost all of Europe.
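The correlation measure mentioned above can be illustrated with a minimal sketch. It assumes one list of ensemble-member flowering dates (day of year) per season and one observed date per season; the function names are hypothetical, and the significance test applied in the study is not reproduced here.

```python
from statistics import median

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ensemble_median_correlation(ensemble_by_year, observed_by_year):
    """Correlate the per-year ensemble median with the observed value."""
    medians = [median(members) for members in ensemble_by_year]
    return pearson(medians, observed_by_year)

# Illustrative (made-up) flowering dates for three seasons, three members each.
ensemble = [[150, 152, 154], [160, 158, 162], [145, 147, 149]]
observed = [151, 161, 146]
r = ensemble_median_correlation(ensemble, observed)
```

In practice this would be evaluated per grid cell over the full re-forecast period rather than over three seasons.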
The skill in flowering prediction increases when using seasonal forecast runs initialized later in the growing season (from February to May, Fig. 1). The skill scores and the reliability improve thanks to the already observed period (used in the simulation of the flowering occurrence). For example, the simulation of the phenological development issued in March benefits from the observed climate data from sowing (autumn) until the end of February. This effect is visible especially in southern Europe, where flowering is reached earlier (generally by the end of May or beginning of June) compared to the rest of Europe. The lack of skill persists in the Iberian Peninsula, France, and some regions of European Russia until the April forecast runs. With the flowering dates approaching, the runs initialized in April and May achieve skillful predictions across almost all of Europe (Fig. 1).
Fig. 1 Skill and reliability of wheat-flowering prediction. a Fair ranked probability score for predicted wheat-flowering dates categorized in terciles (early/normal/late occurrence). The skill maps are shown for the seasonal predictions initialized in November, February, March, April, and May. Cross marks denote regions without prediction skill. b Correlation between observed flowering dates and the ensemble median of the seasonal predictions. Regions with significant correlation are denoted with "+". c Reliability for the probabilistic categorical forecasts of flowering occurrence (early/normal/late) in different regions of Europe. Similar to a and b, the reliability is presented for different forecast initializations.
A. Ceglar and A. Toreti
The reliability of flowering prediction indicates usefulness for decision making in most of Europe for both the late and the early flowering categories after February. The normal category remains the most difficult to predict; in both February and March runs it remains marginally useful over most of Europe. However, there exists a partial positive relationship between model forecast probability and the observed frequency of the occurrence of the normal events; the exceptions being the Iberian Peninsula, the UK, and Ireland in February and central Europe in March (when predictions should not be used for decision making).
To illustrate the performance of predictions for extreme years, Supplementary Fig. 1 shows the observed flowering dates over the arable land in different European regions together with seasonal predictions initialized in February. The prediction quality of flowering occurrence in extreme years (shown for earliest and latest flowering occurrence in the period 1994-2018) is low in the Iberian Peninsula, UK and Ireland as well as France, which is in line with the lack of skill and reliability over these regions (Fig. 1). The extreme years are slightly better predicted in the other European regions.
Drought, wetness, and temperature stress
The prediction skill and the reliability of these indicators are linked to the skill of the crop growth phenology prediction. The agro-climate indicators described here are divided into three groups: hydrological balance (SPEI), excessive wetness, and temperature stress.
Seasonal predictions initialized in November are used to calculate only those indicators entirely falling within the 7-month lead time. To summarize the skillful predictions in the different regions of Europe, we estimate the share of arable land (Fig. 4a) within each region where the FRPSS of the predicted indicators is higher than 0 (Fig. 2). The prediction of the SPEI between sowing and leaf development is largely skillful and (at least marginally) useful across Europe, with predictions over more than half of arable land showing at least some skill (indicator 2 in Fig. 2). Higher skill can be observed especially in regions where sowing takes place before November, and the indicator therefore integrates observed and forecast data initialized in November. Lower but still significant skill can be observed in southern Europe, where the indicator is based entirely on forecast data as sowing occurs in November or early winter (Supplementary Fig. 2).
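The regional summary described above (share of arable land with FRPSS above 0) can be sketched as an area-weighted fraction. This is an illustrative sketch only: the function name, the per-grid-cell inputs, and the numbers are assumptions, not values from the study.

```python
def skillful_share(frpss_values, arable_area):
    """Fraction of arable area in a region where the FRPSS is strictly positive.

    frpss_values: one skill score per grid cell of the region.
    arable_area:  arable land area of each grid cell (any consistent unit).
    """
    total = sum(arable_area)
    skillful = sum(area for score, area in zip(frpss_values, arable_area) if score > 0)
    return skillful / total

# Four hypothetical grid cells; a score of exactly 0 is not counted as skillful.
share = skillful_share([0.2, -0.1, 0.05, 0.0], [10, 30, 40, 20])
```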
The SPEI prediction for the tillering and stem elongation periods (indicators 3 and 4, Fig. 2) shows a significantly lower share of arable land having some skill across Europe. Moreover, these predictions are either marginally useful or dangerously useless. Predictions initialized in February and later generally result in a higher share of arable land where indicator 3 is skillful and (at least marginally) useful. The reliability of the drought forecast points to issues with the prediction usefulness of indicator 4 in most of Europe, except for the Iberian Peninsula, Italy, and France. Predictions in problematic regions become useful when at least part of the stem elongation period has already been observed (i.e., calculations are made on a combination of observed and predicted climate conditions).
Indicator 5, representing the SPEI between wheat heading and maturity, can be predicted only with seasonal climate forecasts initialized in February and afterwards. The reliability is dangerously useless for the UK and Ireland for seasonal climate forecasts initialized in March, April, and May. This is surprising as the forecast initialized in February is skillful and useful over major part of arable land in this region. The wheat areas across the Iberian Peninsula, Italy, south-eastern Europe, and eastern Europe are gaining skill with seasonal predictions initialized later in the season. Even though the skillful area is increasing or remains stable with seasonal predictions initialized later in the season, only marginal usefulness prevails over most of Europe, until the indicator is estimated with seasonal climate forecast initialized in June. At that time, indicator 5 is estimated by integrating observed climate data from heading (usually occurring in May across Europe) until the beginning of June, and forecast data from the June forecast initialization. Given that at the time of the June forecast initialization winter wheat in most of Europe still has an entire grain filling period ahead, indicator 5 strongly depends on seasonal forecast.
Indicator 6 represents the SPEI between wheat sowing and maturity. Depending on the forecast initialization time, it is based on both observed weather and seasonal climate predictions ( Supplementary Fig. 2). For example, in February the observed weather is taken from sowing until the beginning of February, and is complemented with seasonal weather forecast providing data until maturity. It is encouraging that significant skill and reliability for drought prediction are already achieved in February, when at least half of the wheat growing season is still based on predicted climate conditions. Slightly lower share of arable land with skillful prediction compared to the rest of Europe can be observed in Italy, central Europe, and eastern Europe for predictions initialized in both February and March.
Lack of prediction skill can be observed for the excessive wetness indicators (12 and 13; Fig. 2). An exception is indicator 8 (number of days with rainfall amount above 10 mm during the tillering period), with the skill and the reliability progressively increasing with the forecast initialization time. Predictions of indicators 12 and 13 show no reliability and very limited skill in all the analyzed regions, limiting their usefulness for decision making.
On the other hand, temperature indicators show slightly better skill across Europe. The number of days with minimum temperature below 2 °C between booting and flowering (indicator 14) can be reliably predicted in the UK and Ireland already in February. The predictions of the heat stress indicators 15 and 16 (number of days with maximum temperature above 31 °C between booting and flowering, and number of days with maximum temperature above 35 °C between flowering and maturity) show significant skill and reliability especially in the Iberian Peninsula, Italy, France, south-eastern Europe, and eastern Europe. By contrast, in northern Europe, the UK, and Ireland these two indicators have neither skill nor reliability.
Predictability of drought during the sensitive period between heading and maturity
Drought between heading and maturity is of high relevance for wheat production, as it can have a detrimental effect on crop productivity. Such an event occurred, for instance, in 2018 in central and northern Europe as well as in the southern part of European Russia (Fig. 3b). An extreme spring-to-summer drought in 2018 reduced wheat yields by up to 50% in most of the affected countries. The drought conditions in central and northern Europe were accompanied by wet conditions in the southern part of Europe, a situation quite rarely seen in the last 500 years across Europe 17 . The uniqueness of this event makes it an ideal case study also from the seasonal predictability point of view.
The 2018 seasonal prediction of the SPEI between heading and maturity is able to detect the drought only in the run of June (Fig. 3a), while seasonal forecasts issued earlier in the season fail to detect it. The wet event in southern Europe is also forecast only in the June-initialized run, although the indicator is based on a longer period of already observed weather due to the earlier occurrence of heading and flowering than in central and northern Europe. Figure 3a shows another important feature, relevant especially for the operational use of seasonal forecasts to inform decision makers. The June forecast run is able to predict the dipole of water stress in Europe, whereas runs earlier in the season (e.g., February and April) have limited or no skill in many parts of Europe. In June, the skill substantially increases in southern Europe, while the forecast is without skill in parts of central, eastern, and northern Europe as well as the UK and Ireland, where the SPEI calculation mainly depends on forecast weather.
The capability of the seasonal forecasting system to reliably predict drought and wet areas is highly relevant, as such conditions can bring a beneficial compensatory effect at the European level and limit the shock propagation (caused by heavy reductions in productivity). Crop yield losses in drought-affected areas may, indeed, be compensated by gains in areas with sufficient water availability. This is especially relevant for wheat, as it is predominantly a rainfed crop. Figure 3c shows the share of European arable land affected by drought and wet conditions for each year between 1994 and 2019. Around a quarter of European arable land is annually affected by drought and wet conditions; the cumulative share of affected areas peaked substantially above the average value in 2018 and 2019 due to the predominant impact of dry conditions. Drought between heading and flowering affected more than 20% of the entire arable land in 2006 (south-western Europe), 2007 (south-eastern Europe), 2015 (central and southern Europe), 2018 (central and northern Europe) and 2019 (central and western Europe). Limited skill prevails in terms of accurate prediction of drought and wet affected areas in Europe for seasonal forecasts initialized in months between February and May, while the June forecast already achieves higher reliability at the European level (Fig. 3c). Across European arable land, the drought-affected share is generally predicted with higher accuracy than its wet counterpart.
Fig. 2 [...] The hydrological balance indicators are based on the SPEI and are calculated for different wheat growth stages. Gray areas are associated with indicators that cannot be predicted due to the too short lead time to reach maturity. b Same as a, but for excessive wetness and temperature stress indicators. Reliability is estimated for events corresponding to the third category representing overly wet and hot conditions.

DISCUSSION
The findings of this study clearly indicate that seasonal climate forecast can be useful for decision making in the European wheat-producing sector. Wheat-flowering timing can be reliably predicted in central and eastern Europe as early as November; this is a prerequisite for making informed decisions on wheat variety selection and planning of agro-management practices early enough in the season. The predictability of climate events relevant for the European wheat production is strongly dependent on the forecast initialization time, as well as on the nature of the event to be predicted. Overall, we can distinguish two groups of agro-climate indicators with respect to their predictability: hydrological balance events (based on the SPEI) and events that are based on counting the number of days when rainfall or maximum daily temperature exceed specific thresholds.
The predictability of indicators based on the SPEI improves with the forecast initialization time along the growing season in most of Europe. The highest skill is estimated for the prediction of the hydrological balance aggregated over the entire wheat growing season (indicator 6) already at the end of winter and at the beginning of spring. The skill and the reliability of these predictions can be partially attributed to the observed weather integrated in the calculation of the SPEI. Significant skill of seasonal prediction is also observed in regions where the predictability of the hydrological balance at the end of the growing season (indicator 5) is not present in forecasts issued before June; such regions mainly appear in western and central Europe. This clearly indicates that the predictability of drought over the entire season relies on the persistence properties of hydrological balance anomalies, in line with the findings of ref. 18 .
The November forecast, which can only be used to predict events occurring before flowering (the lead time of 7 months is too short to reach maturity), outperforms the climatology and can inform decisions already in the early stages of the growing season, such as the selection of the crop variety to be sown and the planning of field ploughing and fertilization 6,9 . November forecasts can inform decisions for the period between sowing and tillering (indicators 2 and 3, Fig. 2a); in contrast, the hydrological balance between booting and heading (generally occurring in early spring) can be reliably predicted only in the Iberian Peninsula and Italy. This indicates an important impact of initial conditions in the early months of predictions, while the memory is progressively lost with increasing lead time.
On the contrary, the wetness indicators, based on number of days, have no or very limited skill on the seasonal time scale, while only limited skill is present for temperature stress indicators. This is an important finding with implications for further research efforts to overcome these issues. Even though these wetness indicators are highly relevant for the agricultural sector due to their capability to indicate the risk of disease occurrence, alternative approaches focusing on physical large-scale triggering mechanisms (such as atmospheric circulation, sea surface temperature anomalies, sea ice extent and others) could offer better skill on the seasonal time scale [19][20][21] . Higher spatial resolution also increases the prediction skill of, e.g., the winter NAO and atmospheric blocking regimes 15 .
Drought between heading and maturity is of high relevance for decision making, as it affects the most sensitive stages of crop growth. The skill of predicting these events already early in the year is limited, although it increases with the forecast initialization time, especially in south-eastern and eastern Europe. On the contrary, in western Europe (especially the UK and Ireland) and in many parts of central Europe lack of skill can be observed throughout the growing season. There is likely a limited impact of initial conditions on the skill of the forecast in western Europe especially during spring, when the predictability of transition atmospheric regimes becomes problematic. Too strong extratropical jets and northward displacement in the warm half of the year 16 further contribute to the decrease in predictability in spring in western Europe. The bias in jet displacement can have important implications for atmospheric mechanisms leading to drought, such as blocking and Atlantic Low regimes, which had an important role in the 2018 extreme drought in western and northern Europe 17,22 . The role of initial conditions becomes increasingly important in central, eastern, and north-eastern Europe 15 . The relevance of the initial conditions in predicting the hydrological balance between heading and maturity in eastern Europe is further corroborated in Fig. 3d, showing the correlation between the SPEI in two subsequent periods: booting-heading (indicator 4) and heading-maturity (indicator 5). Significant positive correlation is estimated for large parts of eastern Europe, indicating that late spring soil moisture conditions can be important in triggering high-impact events in the following summer.
Fig. 3 [...] The observed SPEI between heading and maturity in 2018. c Cumulative proportion of European arable land affected by drought and wet events during the heading and maturity period. Dark red (blue) colors indicate the share of drought (wetness) affected area that is correctly predicted by the seasonal climate forecast initialized in February, April, May, and June. Only the arable cropland where the FRPSS is positive is considered for the calculation of the predicted drought and wetness affected area. d Correlation between the SPEI calculated in two distinct periods: stem elongation-heading, heading-maturity.
The Clisagri indicators assessed in this study aim to estimate climate-induced risks relevant for winter wheat producers. We should stress, however, that the severity of the impacts can differ depending on prevailing soil properties. For example, drought and/or excessive rainfall events can lead to different effects on crops grown on sandy or clay soils. Even though our study provides an assessment of seasonal predictability of these climate events at European level, the indicator prioritization in terms of climate risk assessment should be based on prevailing environmental conditions such as soil properties.
Reliable seasonal forecasts along the European crop growing season can bring tangible advantages to all key players in the sector, from farmers to EU policy makers. Farmers can be supported in their critical decisions, e.g., on sowing, wheat variety selection, field operations, and effective and sustainable resource and infrastructure planning (such as irrigation). At the policy level, seasonal forecasts can support market and trading stabilization, stock and distribution planning, and ad-hoc payments to deal with heavy losses.
Future work should focus on the assessment of higher-resolution seasonal prediction, large-ensemble approaches using different forecasting systems, and alternative pathways to enhance the skill of seasonal climate predictions of extremes relevant for agricultural decision making. The latter point more than ever emphasizes the need to build climate service tools in a co-design approach with end users (such as farmers) and climate scientists, in order to effectively translate the untapped climate predictability on seasonal time scales into useful information for decision makers. Limited regional predictability may also be overcome by looking at large-scale teleconnection patterns and recently proposed mechanisms. For example, skill in summer seasonal climate prediction in central Europe could be improved by using spring sea surface temperature anomalies in the tropical North Atlantic (which seems to act on the region through an upper-level wave-train 23 ). Finally, advanced Earth system simulations at very high spatial resolution, expected under recently launched initiatives 24 , may provide higher seasonal predictability of surface climate at local-to-regional scales, with significant implications for skilful prediction of agriculturally relevant extreme climate events such as heavy precipitation and wind storms.

Climate data
SEAS5, the ECMWF's fifth generation seasonal forecast system 16 , is here used to derive seasonal predictions of the Clisagri agro-climate indicators. The SEAS5 seasonal forecasts were retrieved from the Copernicus climate change service (C3S, https://cds.climate.copernicus.eu/). To evaluate the SEAS5 prediction quality, we use a set of retrospective forecasts (reforecasts), emulating real predictions for the period between 1993 and 2019. The re-forecast ensemble consists of 25 members and reflects the uncertainty in the initial conditions of the ocean and the land state. Even though the spatial resolution in the atmospheric component is approx. 36 km, the forecasts are up-scaled to 1 degree in C3S.
Three key climate variables are here used: minimum and maximum daily temperatures, and daily total precipitation. Seasonal forecasts were retrieved for six different initialization times, spanning over most of the wheat growing season in Europe: November, February, March, April, May, and June. The lead time of all forecasts is 7 months. Given that the wheat growing season in Europe roughly spans between October (sowing and emergence) and July (maturity and harvesting), the forecast initialized in November is not able to cover the entire length of the growing period (although a 13-month run is available at ECMWF, but not in C3S). Therefore, only those events occurring within June are considered in the analysis of the November runs.
Seasonal forecasts have been bias-adjusted by using the quantile mapping approach 25,26 . The MarsMet dataset has been used as a reference dataset. The MarsMet dataset is interpolated on a 25 km grid based on ~4000 meteorological ground-weather stations across Europe 27 , and is primarily used to support the European agricultural crop production monitoring and forecasting system. For the purpose of bias adjustment, the MarsMet precipitation and temperature data have been up-scaled to the resolution of the SEAS5 data.
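To make the idea of quantile mapping concrete, the following minimal sketch maps a forecast value onto the observed distribution through its empirical quantile. The text does not specify which quantile-mapping variant was used, so the empirical nearest-rank formulation, the function name, and the toy numbers here are assumptions for illustration only.

```python
from bisect import bisect_right

def empirical_quantile_map(value, model_ref, obs_ref):
    """Bias-adjust one forecast value by empirical quantile mapping.

    model_ref: re-forecast values defining the model climatology.
    obs_ref:   reference observations (same variable, location, and season).
    """
    model_sorted = sorted(model_ref)
    obs_sorted = sorted(obs_ref)
    # Non-exceedance probability of the value in the model climatology.
    p = bisect_right(model_sorted, value) / len(model_sorted)
    # Nearest-rank quantile of the observations at that probability,
    # clamped to the range of available observations.
    idx = int(round(p * len(obs_sorted))) - 1
    idx = min(max(idx, 0), len(obs_sorted) - 1)
    return obs_sorted[idx]

# Toy climatologies: the model is systematically too low by a factor of ten.
model_clim = [1, 2, 3, 4, 5]
obs_clim = [10, 20, 30, 40, 50]
adjusted = empirical_quantile_map(3, model_clim, obs_clim)
```

Values outside the model climatology are clamped to the observed extremes in this sketch; operational implementations typically treat the tails more carefully.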

Agro-climate indicators
Agro-climate indicators assessed in this study are tailored to the winter wheat-producing system, and have been proposed and implemented in the agro-climate service Clisagri. Clisagri is based on a set of dynamic agro-climate indicators that bring relevant information on climate-related risks during different stages of the winter wheat growth.
Physiologically, the following stages are generally distinguished along the crop growth cycle: germination, leaf development, tillering, stem elongation, booting, heading, flowering, grain filling and maturity (Fig. 4c). The occurrence and duration of each of these stages depend on sowing dates, crop varieties, daylength and prevailing climate conditions. Winter wheat in Europe is generally sown between early autumn (northern and north-eastern parts of Europe) and late autumn or early winter in the southern part of Europe ( Supplementary Fig. 3). Wheat varieties in northern and eastern Europe are characterized by higher vernalization requirements and lower thermal requirements to reach maturity than in the western and southern parts of Europe. The north-south spatial gradient can be observed also at the occurrence of different phenological stages, with southern and western Europe being earlier. Supplementary Fig. 4 shows the earliest and the latest occurrences of different phenological stages (considering the period between 1994 and 2020).
A dynamic approach has been implemented to assess the risks associated with unfavorable climate conditions and events during the sensitive crop phenological stages by integrating dedicated indicators with a dynamic model simulating the crop phenology. This model has been developed and calibrated by using field experimental and observational datasets across Europe for current winter wheat varieties 28 . The indicators are calculated for specific wheat growth stages that are highly sensitive to climate conditions. The timing of flowering is one of the most critical periods, when wheat is highly exposed to unfavorable climate conditions and extremes such as heat stress. Depending on the year, the difference between the earliest and the latest occurrence of flowering can be as much as 30 days in Europe (Fig. 4b).
Clisagri characterizes different climate events (including extremes): drought, excessive wetness, heat stress and cold stress. Drought is defined by using the estimated hydrological balance during specific growth stages (Fig. 4c). For this purpose, the standardized precipitation evapotranspiration index (SPEI) 29 , a multi-temporal-scale index quantifying persistent anomalies in soil water balance over different time periods, is used. SPEI is calculated dynamically for different growing phases (indicators 2-6, Fig. 4c). Its estimation is based on a non-parametric calculation procedure 6 , which translates the empirical distribution of the climatic water balance into standardized water balance anomalies. This information can support farmers on decisions related to crop variety selection, sowing date, irrigation planning, application of fertilizers and crop protection.
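The translation from an empirical distribution to standardized anomalies can be sketched as follows. The exact procedure of ref. 6 is not given in the text, so the plotting position (rank - 0.5)/n, the function name, and the sample values are assumptions; the sketch only shows the general non-parametric idea of ranking the water balance totals and passing the empirical probabilities through the inverse standard normal distribution.

```python
from statistics import NormalDist

def standardized_anomalies(water_balance):
    """Standardize water balance totals (precipitation minus evapotranspiration).

    Each value is replaced by the standard normal quantile of its empirical
    non-exceedance probability. Ties are ranked in order of appearance.
    """
    n = len(water_balance)
    order = sorted(range(n), key=lambda i: water_balance[i])
    std_normal = NormalDist()
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = (rank - 0.5) / n  # assumed plotting position, strictly inside (0, 1)
        z[i] = std_normal.inv_cdf(p)
    return z

# Hypothetical stage totals (mm) for five seasons at one grid cell.
totals = [-50.0, 10.0, 120.0, -5.0, 30.0]
spei_like = standardized_anomalies(totals)
```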
Excessive wetness indicators (8, 12, and 13) are based on counting the number of days with daily rainfall exceeding a specific threshold (based on farmers' feedback), and aim to capture climate conditions posing risks for anoxia/hypoxia in the root zone, disease occurrence, and impaired canopy structure. These indicators can provide guidance on crop protection strategy, fertilization amount, and grain storage strategy. The last group of indicators (14-16) represents temperature stress due to cold or hot weather conditions and is based on counting the number of days, within specific crop growth stages, when daily temperature falls below or exceeds predefined thresholds (Fig. 4c). These indicators can be especially relevant for crop variety selection and irrigation planning.
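A day-count indicator of this kind reduces to counting threshold exceedances inside a growth-stage window. The sketch below is illustrative: the function name, the index-based stage boundaries, and the rainfall series are assumptions, while the 10 mm threshold is the one quoted for indicator 8.

```python
def count_days_above(daily_values, stage_start, stage_end, threshold):
    """Count days within [stage_start, stage_end] (inclusive indices)
    on which the daily value is strictly above the threshold."""
    window = daily_values[stage_start:stage_end + 1]
    return sum(1 for value in window if value > threshold)

# Hypothetical daily rainfall (mm); the stage spans days 1 to 5 of the series.
rainfall = [0.0, 12.5, 3.0, 10.0, 25.0, 0.0, 11.0]
wet_days = count_days_above(rainfall, 1, 5, 10.0)
```

A cold-stress count would use the same pattern with the comparison reversed (value below the threshold).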
While ref. 30 found that for the central-southern United States precipitation-based indicators are more useful to winter wheat farmers than temperature-based indicators, both types of indicators have been recognized as of equal importance in Europe 3,6,31 , as they can support different types of decisions.
Given the seasonal forecast initialization time, the calculation of Clisagri indicators depends on the integration of observed and forecast precipitation and temperature data (Supplementary Fig. 2). Observed values are used until the time of forecast initialization, and are thereafter merged with seasonal forecasts to cover the entire period for each indicator. This type of merging can be justified as the seasonal forecast data are bias-adjusted. As for the November initialization, the majority of indicators are largely based on forecast values since the timing largely coincides with either sowing or the initial growth phases of winter wheat across Europe. A higher share of observed data is integrated into the estimation of the indicators in subsequent forecast initializations (Supplementary Fig. 2). For example, considering the February forecast initialization, the SPEI during the entire growing season (indicator 6) is calculated by merging the observed weather data from sowing until the beginning of February and the seasonal forecast data until the end of the growing season.
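The merging step can be sketched as concatenating the observed record up to the initialization day with each ensemble member's forecast. The function name and the day-index convention are assumptions made for illustration; the sketch presumes daily series that are already bias-adjusted and aligned in time.

```python
def merge_observed_and_forecast(observed, forecast_members, init_index):
    """Build one merged daily series per ensemble member.

    observed:         daily observations starting at sowing.
    forecast_members: list of daily forecast series, each starting on the
                      initialization day.
    init_index:       number of observed days before forecast initialization.
    """
    observed_part = list(observed[:init_index])
    return [observed_part + list(member) for member in forecast_members]

# Three observed days followed by two forecast days from two members.
merged = merge_observed_and_forecast([1, 2, 3, 4, 5], [[10, 11], [20, 21]], 3)
```

Each merged series can then be passed to the indicator calculation, which yields one indicator value per ensemble member.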

Skill and reliability of seasonal prediction of Clisagri indicators
We divide our indicator predictions into categories relevant for agricultural decision making. Wheat-flowering dates are categorized into tercile classes reflecting flowering timing: early, normal, and late occurrence. The classification is performed for each grid point and is based on the time series of flowering dates simulated between 1994 and 2019. The hydrological balance indicators (2-6) are divided into three categories: drought (SPEI < -0.84), near-normal conditions (-0.84 < SPEI < 0.84), and wet conditions (SPEI > 0.84). Indicators based on the number of days are classified into three categories, as described in Fig. 4c; these categories were determined based on farmers' perception of the severity of the impacts.
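The categorization step can be illustrated with a short Python sketch. The SPEI thresholds are those given in the text; the function names, the flowering-date series, and the handling of values that fall exactly on a boundary are assumptions made for illustration only.

```python
import statistics
from typing import Sequence, Tuple


def spei_category(spei: float, lower: float = -0.84, upper: float = 0.84) -> str:
    """Assign an SPEI value to drought, normal, or wet."""
    if spei < lower:
        return "drought"
    if spei > upper:
        return "wet"
    return "normal"


def tercile_bounds(values: Sequence[float]) -> Tuple[float, float]:
    """Return the two cut points that split a series into terciles."""
    low, high = statistics.quantiles(values, n=3)
    return low, high


def flowering_category(day_of_year: float, bounds: Tuple[float, float]) -> str:
    """Classify a flowering date relative to the grid point's own terciles."""
    low, high = bounds
    if day_of_year <= low:
        return "early"
    if day_of_year > high:
        return "late"
    return "normal"


# Hypothetical 26-year series of simulated flowering days (day of year)
# for one grid point, standing in for the 1994-2019 simulations.
flowering_days = list(range(140, 166))
bounds = tercile_bounds(flowering_days)
category = flowering_category(152, bounds)
```

For an ensemble forecast, the probability of each category can then be taken as the fraction of ensemble members falling into it.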
The Fair Ranked Probability Skill Score (FRPSS) is used here as the skill measure. The FRPSS is based on the squared distance between the cumulative probabilities of the categorical prediction and its corresponding observational reference. The predictions are evaluated with respect to a baseline, which represents the long-term climatology. Such an assessment provides users with information on the added value of the seasonal prediction system against climatology. Since the score is biased for finite ensemble sizes, an ensemble-size correction is applied 32 . The skill score is 1 for a perfect forecast, 0 for forecasts that do not perform any better than the reference forecast, and positive (negative) when the forecast is better (worse) than this reference. The reference forecast contains information only on the long-term mean agro-climate conditions, quantified by long-term mean indicator values based on observed climate.
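A minimal Python sketch of a fair ranked probability skill score is given below. It uses one commonly used form of the ensemble-size correction for the ranked probability score; whether this matches the correction of ref. 32 in every detail is an assumption, as are the function names, the equal climatological probabilities, and the integer coding of categories (0, 1, 2).

```python
from typing import Sequence


def fair_rps(member_categories: Sequence[int], obs_category: int,
             n_categories: int = 3) -> float:
    """Fair RPS for one forecast; requires at least 2 ensemble members."""
    m = len(member_categories)
    score = 0.0
    for k in range(n_categories - 1):
        e_k = sum(1 for c in member_categories if c <= k)  # members in categories <= k
        o_k = 1.0 if obs_category <= k else 0.0            # cumulative observation
        # Squared cumulative distance minus the finite-ensemble correction term.
        score += (e_k / m - o_k) ** 2 - e_k * (m - e_k) / (m ** 2 * (m - 1))
    return score


def climatological_rps(obs_category: int, clim_probs: Sequence[float]) -> float:
    """RPS of a fixed climatological probability forecast."""
    score, cumulative = 0.0, 0.0
    for k in range(len(clim_probs) - 1):
        cumulative += clim_probs[k]
        o_k = 1.0 if obs_category <= k else 0.0
        score += (cumulative - o_k) ** 2
    return score


def frpss(forecasts: Sequence[Sequence[int]], observations: Sequence[int],
          clim_probs: Sequence[float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Skill score: 1 minus the ratio of summed forecast and climatological RPS."""
    rps_forecast = sum(fair_rps(f, o, len(clim_probs))
                       for f, o in zip(forecasts, observations))
    rps_reference = sum(climatological_rps(o, clim_probs) for o in observations)
    return 1.0 - rps_forecast / rps_reference


# Hypothetical example: two years, four ensemble members each.
perfect = frpss([[0, 0, 0, 0], [2, 2, 2, 2]], [0, 2])
wrong = frpss([[2, 2, 2, 2], [0, 0, 0, 0]], [0, 2])
```

In this sketch a forecast with all members in the observed category scores 1, while a forecast with all members in the opposite category scores below 0, consistent with the interpretation of the skill score given above.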
Besides the skill measure, reliability is assessed using the reliability diagram, a diagnostic tool for probabilistic forecasts that shows, for a specific event, the correspondence between the predicted probabilities and the observed frequency of occurrence 33 . The uncertainty of the reliability slope is estimated with a bootstrap algorithm that randomly draws, with replacement, from the set of forecast and observation data pairs and repeats the procedure 1000 times. The reliability of the system is determined according to the slope of the derived reliability line and its uncertainty range. Depending on the slope, five categories can be defined for decision making: perfect, very useful, marginally useful, not useful, and dangerously useless. Reliability is calculated for the arable land area in different regions of Europe (Fig. 4a) 34 .
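The bootstrap estimate of the reliability slope can be sketched in Python as follows. This is a simplified illustration and not the authors' implementation: the slope is obtained from an ordinary least-squares fit of the binary outcomes on the forecast probabilities without explicit binning, and the function names, the 95% interval, the seed, and the example data are assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def reliability_slope(probs: Sequence[float],
                      outcomes: Sequence[int]) -> Optional[float]:
    """Least-squares slope of observed outcomes (0/1) against forecast probabilities."""
    n = len(probs)
    mean_p = sum(probs) / n
    mean_o = sum(outcomes) / n
    var_p = sum((p - mean_p) ** 2 for p in probs)
    if var_p == 0.0:
        return None  # slope undefined when all forecast probabilities are equal
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(probs, outcomes))
    return cov / var_p


def bootstrap_slope_interval(probs: Sequence[float], outcomes: Sequence[int],
                             n_boot: int = 1000, alpha: float = 0.05,
                             seed: int = 0) -> Tuple[float, float]:
    """Percentile interval of the slope from resampling data pairs with replacement."""
    rng = random.Random(seed)
    n = len(probs)
    slopes: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slope = reliability_slope([probs[i] for i in idx],
                                  [outcomes[i] for i in idx])
        if slope is not None:
            slopes.append(slope)
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[max(int((1 - alpha / 2) * len(slopes)) - 1, 0)]
    return lower, upper


# Hypothetical forecast probabilities of an event and whether it occurred.
forecast_probs = [0.1, 0.2, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.9]
event_observed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
slope = reliability_slope(forecast_probs, event_observed)
interval = bootstrap_slope_interval(forecast_probs, event_observed)
```

A slope close to 1 indicates that the predicted probabilities match the observed frequencies, and the width of the bootstrap interval indicates how well the slope is constrained by the available forecast and observation pairs.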

DATA AVAILABILITY
All historical and seasonal forecast climate data are publicly available and open access, with the data sources listed in the Data and Methods. The other data that support the findings of this study are available from the corresponding author upon request.

CODE AVAILABILITY
Methods have been fully implemented in R and the code is available upon reasonable request.
Fig. 4 Agro-climate indicators for winter wheat-producing regions. a Spatial distribution of European arable cropland (green areas) and the regions used in the seasonal prediction reliability assessment (delimited by red lines): IB-Iberia, IT-Italy, FR-France, SEE-south-eastern Europe, CE-central Europe, NE-northern Europe, UK & IR-UK and Ireland, and EE-eastern Europe. b Earliest (left panel) and latest (right panel) simulated day of winter wheat flowering in Europe based on observed meteorological data. c Agro-climate indicators used for the assessment of seasonal prediction skill and reliability. Each indicator is calculated for a specific period during the wheat growing season (the time spans are indicated by brown boxes; adapted from ref. 6 ). There are three groups of indicators, based on: hydrological balance (SPEI calculated for the shown periods), excessive wetness, and temperature stress (cold stress and heat stress). Thresholds for multi-categorical skill assessment are shown on the right side of the indicators.