Forest age estimation in northern Arkhangelsk region based on machine learning pipeline on Sentinel-2 and auxiliary data

Tree age is one of the key characteristics of a forest, along with tree species and height. It affects management decisions of forest owners and allows researchers to analyze environmental characteristics in support of sustainable development. Although forest age is of primary significance, it can be unknown for remote areas and large territories. Currently, remote sensing (RS) data supports rapid information gathering for wide territories. To automate RS data processing and estimate forest characteristics, machine learning (ML) approaches are applied. Although there are different data sources that can be used as features in ML models, there is no unified strategy on how to prepare a dataset and define a training task to estimate forest age. Therefore, in this work, we aim to conduct a comprehensive study on forest age estimation using remote sensing observations of the Sentinel-2 satellite and two ML-based approaches for forestry inventory data, namely stand-based and pixel-based. We chose the CatBoost algorithm to assess these two approaches. To establish the robustness of the pipeline, an in-depth analysis is conducted, embracing diverse scenarios incorporating dominant species information, tree height, Digital Elevation Model (DEM), and vegetation indices. We performed experiments on forests in the northern Arkhangelsk region and obtained the best Mean Absolute Error (MAE) result of 7 years in the case of the stand-based approach and 6 years in the case of the pixel-based approach. These results are achieved for all available input data such as spectral satellites bands, vegetation indices, and auxiliary forest characteristics (dominant species and height). However, when only spectral bands are used, the MAE metric is the same both for per-pixel and per-stand approaches and equals 11 years. It was also shown that, despite high correlation between forest age and height, only height can not be used for accurate age estimation: the MAE increases to 18 and 26 years for per-pixel and per-stand approaches, respectively. The conducted study might be useful for further investigation of forest ecosystems through remote sensing observations.

Forest age is an important proxy for forest management and conservation 1 .It is used for timing of management activities, harvesting and estimation of resource stocks (such as carbon or timber).Forest age is vital in ecological studies as it influence on soil water behaviour and hydraulic conductivity 2 and helps to estimate wildfire risk 3 .Together with other forest parameters, forest age is usually measured by means of field studies; however, these methods are labour-intensive and not economically feasible on a large scale 4 .Advanced sensing techniques allow one to collect Earth observations with required spatial-temporal properties 5 .Therefore, contemporary approaches in environmental studies are focused on remote sensing data implementation in forest properties estimation tasks 6 .
A common pipeline to estimate forest age usually follows this sequence: an intermediate variable such as height, stock volume, crown projection area, or another is estimated from remote sensing data and then converted to age.For instance, forest age is highly correlated with forest height 7 .The development of biological systems over time is associated with their growth and is strongly dependent on plants, particularly trees.Therefore, during growth and maturation a tree age and height can change according to the same dynamics, and a number of studies used this relationship to estimate the age of the tree 8 .Mostly, linear dependency between age and height

Forest inventory data analysis
The study area is located in the north of the European part of Russia, in the Arkhangelsk region.The relative location is shown in Fig. 1.The region of interest has coordinates between 45 • 9 ′ E and 47 • 9 ′ E longitude and between 61 • 18 ′ N and 62 • 12 ′ N latitude.
The forestry inventory data was collected in 2018 for several forestries in Arkhangelsk region, covering a total area of 126-641 ha.It is presented as vector georeferenced data and includes boundaries for 12,617 forest stands.Average forest stand size is 10 ha.The term forest stand refers to a specific area or section of a forest that is relatively uniform in terms of its composition and structure 21 .Each individual stand is supplied with information about main forest characteristics.Figure 2 depicts several stands in inventory data.
In this study, we consider the following forest characteristics: • Forest stand species structure.The composition formula is ascertained by calculating the percentage of each tree species presented in the stand.This composition is presented in percentage points (composition coefficients), where each point equates to a ratio of 10% of total stand stock.We use forest composite formula to define dominant species (species with the content more than 50% within a stand).• Mean stand age (years).The average age of model trees belonging to the given stand (for a fixed type).A model tree is an average tree in its parameters, in our case such a parameter was the diameter at breast height (DBH).
It should be noted, that trees with DBH ≥ 1 cm are recorded in the forest inventory data.• Mean stand height (m).The average height of the dominant part of the stand (weighted average height between the species involved in the composition of the stand, based on composition coefficients).
Seven species types are presented in stand structure data: spruce, birch, pine, fir, aspen, willow and alder (the scientific names for the species are outlined in Table 1), however, not all of these species are present in large enough numbers.To account for only the most prevalent tree species, we consider the dominant forest species, which is a species that occurs in the formula at a frequency greater than 50%.If there is no dominant species or if two species occur equally, the stand is considered mixed.
The distribution of stands by areas is shown in Fig. 3.The size of the stands varies from 0.1 to 646 ha.About 95% of the stands are smaller than 32 ha, and half of them are smaller than 5.6 ha.Since the Sentinel-2 data used has a spatial resolution of 10 m, an image patch of 10 by 10 pixels corresponds to an area of 1 hectare.Therefore, half of the stands can be described using 560 pixels or less.
According to forest inventory data, the age distribution is heterogeneous, as can be seen in Fig. 6: there are many stands with ages over 100 years in the north, while there are very few such stands in the south.

com/scientificreports/
There is a correlation between age and tree height, and the relationship is different for different tree species.The correlation between tree height and age is more pronounced with older trees, while for younger trees it is well-aligned with a straight line (Fig. 4).This correlates well with the study 22 , which notes that the relationship between height and age weakens for trees over 100 years old.
According to Fig. 4, the height and age correlation in birch, aspen, and pine stands can potentially be estimated through a linear function, albeit mainly for young stands.In spruce-dominated or mixed forests, however, Table 1.The scientific names with authorities for the species cited in the study.To enable further analysis and the application of machine learning techniques, the vector forest inventory data is rasterized to a spatial resolution of 10 m, which matches the spatial resolution of Sentinel-2 satellite imagery.This was done with Rasterio 23 , a Python library used for reading and writing geospatial raster data.The pixel of the resulting raster inherits the value of the polygon that owns the center of the pixel.If the center of the pixel lies exactly on the boundary of the vector, then one of the equal values of the adjacent polygons is taken.In any case, there are very few edge pixels compared to the entire polygon size, and any mistakes on boundaries can be ignored.After rasterizing, we get a set of maps with pixel values equal to the corresponding value in vector format, for example, age map or dominant species map.More details about the distribution of the rasterized inventory data are depicted in Fig. 5.

Test splits
Forest characteristics are distributed non-uniformly, for example, as it is shown in Fig. 6, there are a lot of old stands on the north comparing to the south territories.This spatial heterogeneity was the reason why two test sets ( Test forestry and Test random ) were chosen to evaluate the quality of the models.To validate the case of age prediction on new territories, we are using Test forestry corresponding to a separate forestry.To check model quality on the same species distribution as in train and validation sets, we use Test random .It is collected from random pixels covering Region 1 and Region 2. Specifically, we divided pixels corresponding to Region 1 and Region 2 randomly into three parts: train (for model training), validation (for tuning hyperparameters of the model) and Test random in the proportion 0.68, 0.12 and 0.2, respectively.The Test forestry area is 13-754 hectares.Region 1 and Region 2 have a combined area of 112-887 hectares.

Sentinel-2 data and digital elevation model
In this study, we used multispectral Sentinel-2 data with Level-2A (L2A) preprocessing that includes atmospheric correction.10 spectral bands (B02, B03, B04, B05, B06, B07, B08, B8A, B11, B12) and SCL mask (Scene Classification Layer) with spatial resolution of 10 m were used.Images in channels B05, B06, B07, B8A, B11, B12 and SCL mask have source resolution 20 m, so they were upsampled with nearest neighbour interpolation method to For training and testing, we filtered data and used only those pixels that had value 4 in SCL mask, which corresponds to vegetation class.By this step we tried to filter non-forest pixels, specifically, cloud pixels.With a reasonable degree of accuracy, non-forest regions can be excluded through the described method, since satellite image pixels from this region correspond either to forest, deforestation areas, or clouds.There are two main reasons for this.Firstly, the area of interest is situated far away from any settlements or other non-forest land features such as crop fields.Therefore, all vegetation in this area is only forests, nothing else.Secondly, forest is only inspected during the growing season; consequently, those areas should have a leafy appearance and be in class 4 on the SCL mask.Alternatively, a more detailed forest mask can be obtained using deep learning segmentation method based on Sentinel-2 images that was in our previous study 25 .
There are two main restrictions for choosing the date of satellite image: (1) absence of cloud cover, (2) months of vegetation period (June, July, August, September).The manually selected data for the study are summarised in Table 2.
As an auxiliary data, we also used Digital Elevation Model (DEM) measurements available in Ref. 26 .The original spatial resolution is 1 m, downsampled to 10 m to match Sentinel 2 raster imagery using the rasterio.warp module (nearest neighbour resampling method).

Vegetation indices
Vegetation indices are known as important additional features for forest characteristics prediction 27 .In this study, we also computed a number of indices (Table 3): • Normalized difference vegetation index (NDVI) is the standard vegetation index assessing whether or not the target being observed contains live green vegetation.• Normalized difference water index (NDWI) measures the relative water content in vegetation and soil based on the difference in the absorption features of water and chlorophyll in the near-infrared region.• Soil-adjusted vegetation index (SAVI) is an enhancement of NDVI for highly vegetated areas, which reduces the influence of the soil on the vegetation detection 28 .For SAVI was used the value of the constant L = 0.428 .• Atmospherically resistant vegetation index (ARVI) is an improvement on NDVI that minimizes its influence on atmospheric information contained in the blue channel 28 .For ARVI was used the value of the constant y = 0.106.• Enhanced vegetation index (EVI) is an index that mitigates the effect of atmospheric influence on the remotely sensed signal and improves accuracy in determining vegetation coverage.

Experiments
In this study, we aim at formulating and investigating a set of possible options to develop an ML-based pipeline for forest age prediciton through Sentinel-2 data.The details are described below in "Experimental" sections.In terms of ML task definition, the task is to solve a regression problem based on various spatial features.
As ML model, we chose a CatBoost algorithm available in a high-performance open-source library 29 .CatBoost is based on gradient boosting on decision trees.During training, a set of decision trees is built consecutively so that each successive tree is built with reduced loss compared to the previous trees.
Catboost is known for its speed and accuracy, it is widely used for tabular data in environmental studies 30 .It has a fast implementation on both CPU and GPU.Compared to standard Random Forest from sklearn or gradient boosting realisations such as LightGBM or XGBoost, Catboost is tens of times faster if only CPU is available, and hundreds of times faster if GPU is available 31 .This means that for a large area with many pixels, one can get results with the same or better accuracy in a foreseeable time.This in turn allows for quick experimentation with different setups and input features for forest characteristics prediction.Therefore, the Catboost algorithm optimally serves the main goal of this study as a standardised ML algorithm.
The study workflow is depicted in Fig. 7.It involves Sentinel-2 images and a set of supplementary data.The target variable is forest age derived from field-based measurements.Per-pixel or per-stand approaches can be utilized to create a dataset.We examine both of them.Data filtering is required to exclude irrelevant samples with corrupted values.Two types of test choosing is considered and discussed.The following sections describe details of the main approaches that we studied.

Evaluation metrics
To evaluate prediction results for different experimental setups, we considered the following commonly used metrics: R 2 (Coefficient of Determination), Mean Absolute Percentage Error (MAPE), Root of Mean Squared Error (RMSE), and Mean Absolute Error (MAE).They are computed in per-pixel manner using the next formulas:   where ŷi is the predicted value of the i-th pixel and y i is the corresponding true value ( ε is an arbitrary small yet strictly positive number to avoid undefined results when y is zero).Using pixel-level metrics enables us to consider prediction errors weighted by the area of forest stands.

Experiment 1: stand-based and pixel-based approaches
In this experiment, we compare two main approaches for forest characteristics prediction using multispecral images only: stand-based and pixel-based.
In stand-based approach, we use entire stand characteristics to create a single training sample based on statistical values (min, max, std, and mean).Thus, for each forest stand, we extracted the minimum, maximum, standard deviation, and mean value of raster pixels that fall into each stand polygon.This was done for all multispectral and auxiliary data stored in raster format.In this setting, we implicitly use information about borders in the forestry, since we average multispectral and supplemented values across each stand.The obtained training sample is a forest stand.In pixel-based approach, each pixel is treated independently to the stand it belongs.The training sample in this case is a single pixel with features corresponding to Sentinel-2 and auxiliary data.
We compare three main scenarios for age prediction: • train on stands, predict on stands, or stand-to-stand ( −→ ); • train on stands, predict on pixels, or stand-to-pixel ( −→ •); • train on pixels, predict on pixels, or pixel-to-pixel The setup of the experiment is as follows.
(1) There is no Test random set as it could not be properly introduced for stand based approaches, so evaluation metrics ( MAPE , RMSE , MAE , R 2 ) are calculated on Test forestry set.
Dates.We have carefully constructed the train and test sets to ensure that our model can generalize well to new data.Specifically, in the train set, we have excluded one date per year so that the model can learn to predict tree ages across multiple dates.Additionally, we have excluded all images from the year 2021 in the train set to evaluate how well the model can generalize to new years.To create ground truth age values, we used rasterized field inventory data observed in 2018 and added 1.0, 2.0, and 3.0 years per pixel for 2019, 2020, and 2021, respectively.This ensures that our ground truth data accurately reflects the age of each tree in the images.
Stand-to-stand approach −→ .In the stand-to-stand approach, the training tabular data was generated as follows.First, based on rasterized inventory data, pixels corresponding to each forest stand were selected.For each stand, four statistics (standard deviation, mean, minimum, and maximum values) were calculated for each individual input feature (such as spectral bands and vegetation index).The statistics for different dates were then stacked together as a single sample in the dataset.Then, on prediction stage for given stand in test dataset, we make a prediction of stand age based on averaged statistics calculated out of pixels corresponding to the given stand.The metrics are always calculated per-pixel.
Stand-to-pixel approach −→ •.The training set is prepared as in the previous approach; the difference is in the testing stage.In this setting, we treat each pixel independently from stand boundaries in test dataset and apply the same model trained on stand characteristics to each pixel.In this case, mean, maximum, and minimum values equal to pixel values itself, and standard deviation equals to 0.
Pixel-to-pixel approach • −→ •.The training set consist of pixel values for all 10 bands.In the testing stage, age was predicted in a per-pixel manner.

Experiment 2: aates for training
A common choice to validate an ML-based approach for forest characteristic estimation on a local scale is to use the same satellite image covering neighborhood territory.In Ref. 32 , it has been shown that such an approach can lead to inappropriate results for new observation dates even if they are close to the training date.To examine forest age prediction for various dates, we conducted an additional experiment.
The objective of this experiment is to assess the reliability of the algorithm across various dates.The experimental setup involved training a CatBoost model for each available date in a pixel-to-pixel setting, and calculating the metrics for the model on the remaining dates.This training and validation procedure was repeated for all available images.
For our experiment, we utilized multispectral imagery data.To determine the stand age values for 2019 and 2020, we used ground truth data from raster field inventory observations conducted in 2018, and added one or two years to each pixel to account for the time elapsed.This allowed us to accurately assess changes in the stands over time.

Experiment 3: forest species
In this experiment, our aim was to understand model capability to predict forest age on several forest species individually.We compared a model trained using age measurements for all dominant species and 5 individual models trained only on stands of each given dominant species (spruce, birch, aspen, pine, or mixed).In this experiment, we used pixel-to-pixel approach with Sentinel-2 bands and DEM for only one date (2018-07-30).It is important to recognize that in the experiment involving five regressors, the amount of training data available for each regressor was reduced.However, this reduction in data size was counterbalanced by an increase in data purity, as each regressor was trained exclusively on the pixels belonging to a particular tree specimen.

Experiment 4: choosing bands and indices
The objective of this experiment was to investigate the impact of additional input features on the accuracy of the prediction.We conducted a pixel-to-pixel analysis, comparing various combinations of available data.We used images from a single date (2018-08-27).Forest species information, categorized as Type of Stand, was included as a categorical feature based on inventory data.

Experiment 1: stand-based and pixel-based approaches
In this experiment, we compared different approaches to create training samples for forest age prediction when inventory data is used as a ground-truth values.Results for stand-to-stand, stand-to-pixel, and pixel-to-pixel approaches are presented in Tables 4, 5 and 6, respectively.
The stand-to-stand approach produces the best results for both R 2 and MAPE, with an average R 2 value of 0.91 and an average MAPE of 0.18.However, this approach has a major disadvantage in that it relies on stand boundaries, which may not be available for new areas.In this regard, the stand-to-pixel and pixel-to-pixel approaches are more appropriate.The pixel-to-pixel approach yields better results in terms of both MAPE and R 2 , with an average MAPE of 0.23 and an average R 2 of 0.79, compared to an average MAPE of 0.29 and an average R 2 of Table 4. Experiment with the stand-to-stand approach ( −→ approach).Bold values correspond to dates that are both in train and in test sets.Only multispectral features were used for this experiment.0.32 for the stand-to-pixel approach.These findings suggest that using statistics calculated over a stand is not sufficient to predict forest age in a per-pixel manner with enough accuracy.We should note that although there are no images from the entire year of 2021 in the training set, we can still obtain acceptable results for these dates.For instance, in a pixel-to-pixel setup, the model achieves a mean absolute percentage error (MAPE) of 0.246 and 0.192 for 6th of August, 2019 and 7th of June, 2021, respectively.However, high results are not achieved for all dates.For example, the model fails to obtain high results for the observation gathered on August 28th, 2021, with the MAPE dropping to 0.313.A possible explanation for such model behavior is that the image was obtained at the end of summer, and the spectral characteristics changed drastically.However, a rather close date of August 21st, 2020, which also did not appear in the training dataset, has a MAPE of 0.192.
Moreover, the dates with poor prediction quality are the same for all approaches.There are some dates (such as 27th of August, 2018, and 28th of August, 2021) that are not present in the training set, and for which the predictions are still poor for all three approaches.Thus, for 27th of August, 2018, stand-to-stand, stand-to-pixel, and pixel-to-pixel approaches yield just MAPE of 0.412, 0.489, and 0.316, respectively.This means that poor predictions may go unnoticed if the date of the satellite image to be predicted is not included in the training set.The effects of different choices for the training and testing dates will be explored more in the next experiment.

Experiment 2: dates for training
In this experiment, we examined different sensing dates and model capability to predict forest age for previously unseen dates.For each date, we trained an individual model and then evaluated it on all other dates.Therefore, test samples present both previously unseen regions and spectral features from new dates.We used pixel-to-pixel setup and validated the results on Test forestry set.Experimental results are presented in Fig. 8.The main diagonal represents testing on the same date used in training.In most cases, these values are the best.The average MAPE for the diagonal values is 0.124.However, there are a few dates (11th of September, 2018, and 4th of June, 2020) where predictions are poor for all test dates, unless the training date coincides with the test date.The average MAPE computed for the test date of 4th of June, 2020, equals to 0.257, while when we train and validate the model using this single image the MAPE equals to 0.129.
In general, the quality of the prediction is higher when the test and training dates coincide.Prediction examples for different dates are depicted in Fig. 9.

Experiment 3: forest species
In this experiment, we aim to assess how dominant species affect predictions.We compared a single model trained employing age measurements across all prevalent species (Table 7), with five distinct individual models (Table 8) trained exclusively on stands characterized by each respective dominant species, namely spruce, birch, aspen, pine, and mixed-species stands.For these experiments, we applied the pixel-to-pixel setup.
The overall MAPE for a single regressor is 0.197, while for separate regressors we have an average MAPE of 0.126 and a maximum MAPE of 0.184.Thus, having a separate regressor for each tree specimen reduces the MAPE.
We can observe that learning patterns from a dataset of all species is more challenging for the model compared to a dataset of a single specimen.For instance, age prediction for birch and spruce is more accurate when predictions are made by an individual model rather than a shared model for all species.This trend is observed for both the Test random and Test forestry sets.Additionally, a specific model for pine and aspen performs better in Test random , but not in Test forestry due to the insufficient amount of stands with these dominant species in the Test forestry set.
However, this does not apply to Mix stands.For this complex type of stands, it is irrelevant whether the train set has become homogeneous or not.In fact, the predictions for this type of stand became worse because the train set decreased.

Experiment 4: choosing bands and indices
This experiment compares different feature sets to understand which source of information is more significant for predicting the age of a forest.We trained an ML model using different subsets of features and compared the results on both the Test random and Test forestry sets.Tables 9 and 10 show the corresponding results.
Knowledge of stand dominant species and stand height greatly improves the results, yielding the MAPE of 0.08 and 0.124 for Test random and Test forestry .However, using multispectral data alone still provide good quality predictions, with MAPE of 0.223 for both test sets.One of the most important features for forest age prediction is B08 band, which corresponds to the NIR spectral range with an initial resolution of 10 m.Other bands have a smaller effect on the result and for the sake of compactness we have only shown band B03 in the table.The MAPE is slightly reduced by band B03 by 0.02 for Test random and 0.04 for Test forestry (all other bands have a similar effect).
Adding vegetation indices listed in Table 3 to the Sentinel-2 data slightly reduces MAPE by 0.01 for Test forestry and has almost no effect for Test random .The incorporation of NDVI, NDWI, SAVI, ARVI, and EVI yields a minor yet a positive effect on the estimation of forest age.
When combined with Sentinel-2 bands, forest height derived from inventory data ( Height stand ) produces a MAPE of 0.104 and a R 2 of 0.951 on the Test random set, and a MAPE of 0.148 and R 2 of 0.913 on the Test forestry set, i.e. utilizing information about height significantly improves the age predictions.
Visualised ground-truth and model predictions using different input features are shown in Fig. 10.All available information gives predictions that are closest to the ground-truth.However, the use of satellite imagery alone, without any information on forest inventory properties, still gives good results for forest age prediction for the new areas.

Discussion Experiment 1: stand-based and pixel-based approaches
Stand-based and pixel-based approaches are commonly used strategies to predict forestry variables.However, typically only one approach is utilized in a given study.To address this limitation, we conducted a comparative analysis of stand-based and pixel-based approaches, highlighting the respective strengths and weaknesses of each.
The per-stand approach has a main disadvantage, which is the requirement of stand boundary information that may not be available in practice.Experimental results have shown poor performance for model prediction in the stand-to-pixel setup.Nonetheless, this approach holds promise for future investigation, as it has the potential to create a robust transfer between stand-based training procedures and pixel-based inference.
In this study, we focus specifically on different ML-based pipelines for forest age estimation.A more detailed look at deep learning models applications can be inspired by the obtained findings.For instance, advanced computational techniques can be applied to conduct future comprehensive analysis with neural networks.
Previous studies have shown high performance for forest age estimation on a large scale with low spatial resolution of 1 km 12 .It is important to notice, that the present experiments are conducted with the spatial resolution of 10 m per pixel.It provides a rather detailed forest age map that can be further applied for environmental studies such as carbon stock estimation and ecological condition assessment.In the previous studies, it has been shown that geo-spatial distribution plays a vital role in environmental tasks 33 .When a model is designed to process new territories, it assumes that new observation dates are considered.Therefore, we studied the effect of transferring between different dates.The quality of the prediction is usually higher when for training and testing the same date is used.Nevertheless, for regions that are remote from the studied area, images with the same date of observation can be not available.We highlighted its importance for practical applications development and comprehensive study conduction.The main challenge is to identify

Experiment 3: forest species
In Ref. 15 , quality of forest age estimation was measured for a single species stands, namely for spruce stands.In our study, we aimed to assess how forest species affect prediction results.The achieved results show that separate models perform better than a single shared model that predicts forest age for all forest species simultaneously.It is more difficult for the model to learn patterns from a dataset of all species than from a dataset of a single specimen.Indeed, age prediction for birch and spruce is better when predicted by a specific model rather than by a single model for all species, which could be seen in both Test random and Test forestry .In summary, when a forest species map is available, it is better to use a single model for each of the given species.

Experiment 4: choosing bands and indices
Experimental results demonstrated that knowledge of stand dominant species and stand height greatly improves the results.However, in practice, this data is not available in most cases and all forest variables should be estimated based on remote sensing observations.Therefore, the main attention should be pay to remote sensing approaches.The obtained results show high perspective to develop efficient pipelines for forest age estimation.Moreover, it encourages further development of algorithms for precise forest characteristics estimation such as height or biomass 34 .They can be based on satellite data of different properties and further integrated into an entire pipeline for forestry analysis.
Incorporating observations for different dates and creating time series might be effective for additional spectral features extraction.For instance, the achieved findings on forest age estimation can be aggregated with results for Sentinel-2 time series exploration in land use classification 35 .In addition to summer images, one can utilise winter observations for conifers as an additional source of spectral characteristics.
In summary, it is worth noting that the experimental results obtained are consistent with previous studies.However, due to limitations in open-access forest inventory datasets, it is difficult to compute metrics on the same test sets for the same regions.Directly comparing metrics is also not very informative due to differences in environmental conditions among regions.Nevertheless, the achieved results are consistent with those reported in other studies.For example, in Ref. 15 , the authors reported an RMSE of 11.5 years for spruce stands using Sentinel-2 data.
To boost model performance, additional vegetation indices can be further considered such as an adaptive vegetation index or others proposed in Refs. 36,37.In the present experiments, we also showed the importance of NIR spectral band.A future study might consider utilizing of artificially generated NIR band 38 .

Conclusions
The present work addresses the challenge of estimating forest age through remote sensing data.Accurate forest age maps are crucial both for environmental studies and for making informed management decisions.Developing these maps requires optimal training strategies for remote sensing observations and machine learning (ML) algorithms.The specific definition of training samples and the train-test split strategy can affect the ultimate results.Therefore, we formalized and considered a set of possible options to create an ML-based pipeline for forest age estimation based on Sentinel-2 images.We compared auxiliary features such as DEM and forest inventory information; we studied the model's performance with two train-test split strategies and examined per-pixel and per-stand dataset sampling.Additionally, we studied the train-test image selection with different observation dates.The obtained mean absolute error (MAE) metric for per-pixel forest age estimation was 11 and 7 years for just multispectral images and for Sentinel-2 with DEM and forest inventory information, respectively.

Figure 1 .
Figure1.Study area and regions with available forest inventory data.Yellow region (Test region) corresponds to Test forestry .Green regions 1 and 2 is randomly split into training, validation, and Test random sets.The map was generated with the QGIS v.3.14 software (https:// qgis.org/ en/ site/) and RGB satellite composite from Google Maps layers available in QGIS.

Figure 2 .
Figure 2. Example of used forest inventory data with boundaries of individual forest stands.The map was generated with the QGIS v.3.14 software (https:// qgis.org/ en/ site/) and RGB satellite composite from Google Maps layers available in QGIS.

Figure 3 .
Figure 3. Distribution of stands by areas.

Figure 4 .
Figure 4. Correlation between age and height for each forest species in inventory data.

Figure 5 .
Figure 5. Distribution of age, dominant species, heights, and timber stock in rasterized inventory data.

Figure 6 .
Figure 6.Forest age distribution from inventory data throughout the study area displays non-uniformity, with older stands being predominant in the north and younger ones in the south.The map was generated with the QGIS v.3.14 software (https:// qgis.org/ en/ site/) and RGB satellite composite from Google Maps layers available in QGIS.

Figure 7 .
Figure 7. Study workflow.A set of input spatial features are considered to predict forest age.The workflow includes per-pixel and per-stand approaches to create training set based on inventory measurements.

Figure 8 .
Figure 8. Experiment with different training and testing dates.Multispectral data were used for this experiment.(a) R 2 (the ligther the better), (b) mean average percentage error (the darker the better).

Figure 9 .
Figure 9. Experiment with different training and testing dates.Ground-truth and predictions for each test date in window of 300 px × 300 px (3 km × 3 km).Multispectral data were used in this experiment.The train date is 2018-07-30.

Table 2 .
Dates of selected Sentinel-2 images for each studied region.

Table 3 .
Vegetation indices considered in the study.

Table 5 .
Experiment with stand-to-pixel approach ( −→ • approach).Bold values correspond to dates that are both in train and in test sets.Only multispectral features were used for this experiment.

Table 6 .
Experiment with pixel-to-pixel approach ( • −→ • approach).Bold values correspond to dates that are both in train and in test sets.Only multispectral features were used for this experiment.

Table 7 .
Experiment for different forest species.One model for all species.Multispectral and DEM data were used. (a

Table 8 .
Experiment for different forest species.An individual model is trained for each specimen.Multispectral and DEM data were used. (a)

Table 9 .
Experiment with different input features.The results are declared for Test random .The highest model results are in bold.

Table 10 .
Experiment with different input features.The results are declared for Test forestry .The highest model results are in bold.