Regional height growth models for Scots pine in Poland

Site productivity remains a fundamental concern in forestry as a significant driver of resource availability for tree growth. The site index (SI) reflects the overall impact of all environmental factors that determine tree height growth and is the most commonly used indirect proxy for forest site productivity estimated using stand age and height. The SI concept challenges are local variations in climate, soil, and genotype-environmental interactions that lead to variable height growth patterns among ecoregions and cause inappropriate estimation of site productivity. Developing regional models allow us to determine forest growth and SI more appropriately. This study aimed to develop height growth models for the Scots pine in Poland, considering the natural forest region effect. For height growth modelling, we used the growth trajectory data of 855 sample trees, representing the Scots pine entire range of geographic locations and site conditions in Poland. We compared the development of regional height growth models using nonlinear-fixed-effects (NFE) and nonlinear-mixed-effects (NME) modelling approaches. Our results indicate a slightly better fit to the data of the model built using NFE approach. The results showed significant differences between Scots pine growth in natural forest regions I, II, and III located in northern Poland and natural forest regions IV, V, and VI in southern Poland. We compared the development of regional height growth models using NFE and NME modelling approaches. Our results indicate a slightly better fit to the data of the model built using the NFE approach. The developed models show differences in height growth patterns of Scots pines in Poland and revealed that acknowledgement of region as the independent variable could improve the growth prediction and quality of the SI estimation. Differences in climate and soil conditions that distinguish natural forest regions affect Scots pine height growth patterns. Therefore, extending this research to models that directly describe height growth interactions with site variables, such as climate, soil properties, and topography, can provide valuable forest management information.

Site productivity remains a fundamental concern in forestry as a significant driver of resource availability 1 . Appropriate estimation of site productivity is critical for forest management decisions 2 . Additionally, information on site productivity helps monitor and assess climate change impact on forest ecosystems and provide information to support forest management to take effective adaptation measures 3 . The site index (SI) reflects the overall impact of all environmental factors that determine tree height growth 4 and is the most commonly used indirect proxy for forest site productivity estimated using stand age and height 2,[5][6][7][8][9] . An appropriate height growth model is needed to determine the SI accurately.
One of the most critical challenges in the SI concept is a local variation of height growth patterns that result from a variety of factors and could cause the under-or over-estimation of the growth potential and productivity of a site 10,11 . Species variability caused by sensitivity to environmental factors is the primary reason why a more flexible approach to developing height growth models is recommended to include the impact of local conditions and to accurately reflect height growth variation on a regional scale 12 .
Climate, soil, and genotype-environmental interactions can favour species adaptation to local ecological conditions, causing variable height growth patterns for the same species among ecoregions 10,11 . In a study conducted in Sweden, Johansson 13 found significant variation in forest height growth curves due to mineral soil type and geographical location. The effect of genetic variability on the height development of trees was reported by Adams et al. 14 . Genetic variability has also been identified as the primary factor modifying the trajectory of height growth by Buford and Burkhart 15 , who found that the height growth models developed for different provenances differed in their parameters describing the asymptote. Significant differences in height growth www.nature.com/scientificreports/ patterns can also occur for the same species, even within a given SI class 16 . García 17 reported that three sources of height growth development variability could be identified: (a) between sites, (b) within sites, and (c) observation error. Because of the inter-regional variability of growth patterns, regional height growth models (RMs) are derived to improve the model's predictive ability 11,12,[18][19][20] . Studies on Pinus pinaster in Spain indicated significant differences in growth patterns observed in different areas because of the effect of genotype-environment interaction 11 . Research describing regional height growth differences highlights the regional application of the SI concept 18,[21][22][23][24] . The role of regional models may be crucial for sustainable forest management, particularly in solving problems that require accurate information on local growing conditions, where climate and soil type lead to inter-regional variation 20 .
The Scots pine is one of the most important species in Europe and ranges from the boreal region in Northern and Eastern Europe to the mountains of the Mediterranean in Southeastern Europe (Fig. 1). It is also the most important tree species in Poland, covering approximately 58% of the total forested area of the country 25 . The Scots pine grows under very variable ecological conditions 26 , which could affect the difference in the shape of the growth curves among ecoregions 13 . The effective management of such a considerable forest resource requires a thorough assessment of the productivity of stands at a regional level 27 . Growth modelling is crucial in assessing the consequences of forest management activities in long term planning in forestry 28 . The use of inappropriate height growth models may result in erroneous estimation of site productivity. This problem could be solved by the development of models, which reflect the difference in growth trajectories resulting from specific growth conditions in a given area. Regional models reflect the impact of regional variability on height growth dynamics resulting from specific climate conditions, soil properties, and differences in local forest management.
In Poland, yield tables with discrete scales are still used in forest management for forest growth and site productivity estimation [29][30][31] . The forestry practice reports the incorrect determination of site productivity and incremented volume using yield tables, probably because of the inadequacy of Schwappach's tables from differences in growth conditions, which have changed considerably during the 100 years since the yield tables were developed. Pretzsch et al. 32 estimated that in Central Europe, site productivity increased by approximately 30-70%. Additionally, the height growth of observed stands differs significantly from that observed 100 years ago 33,34 and included in the original tables. Moreover, the tables produce errors by not considering regional growth variability on a countrywide scale.
The research adopted the hypothesis that height growth is regionally differentiated. We assumed that height growth patterns are influenced by the variability in site conditions, which can be acknowledged by considering the specific effect of natural forest regions (NFE), distinguished by environmental conditions. Therefore, the research objective was to analyse the variability of height growth patterns between natural forest regions in Poland and develop height growth models for the Scots pine in Poland taking into account the specific effect of the region.

Materials and methods
Collection of the research material. The research material consists of the height growth trajectories of 855 dominant and codominant Scots pines ( Table 1). The most substantial part of growth trajectories was collected from the sample plots established within the project "Actual and potential site productivity in Poland for  Table 1). Natural forest regions 37 were adopted as the regionalisation unit in Poland. Natural forest regions were distinguished by the natural diversity of the country, in particular: variability of climatic and geological conditions, naturally occurring ranges of principal forest-forming tree species, presence and distribution of natural landscapes classes, and distribution of the primary units of potential natural vegetation 37 .
Stem analysis. Stem analysis was used to reconstruct the past height growth of the trees. The trees for stem analysis were selected in the direct neighbourhood of the permanent sample plots established in Scots pine stands. For each plot, tree diameter and height close to the top height (TH)-defined as the average height of the 100 thickest trees per 1 ha-were selected and cut for stem analysis (SA). Only trees with typically developed crowns and no visible damage or growth anomalies were selected for SA. After cutting, the total lengths of the trees were measured, and discs were collected for SA. Discs were taken from the base of the felled tree at the height of 0.5 m, and the breast height diameter (DBH) was measured; for trees over 15 m high, subsequent discs were collected at heights of 2.0, 4.0, and successive 2-m intervals to the top of the tree. For trees shorter than 15 m, subsequent discs were collected at heights of 1.5, 2.5, 3.5, and at 1-m intervals further up the tree. The course of growth of the individual trees was reconstructed based on the height of the discs and the number of annual discs. The growth trajectories of all collected trees were visually examined for suppression and release patterns and growth anomalies. As cross-section lengths do not coincide with periodic height growth, the height-age data from the SA were corrected using Carmean's method 38 . The collected growth series for six natural forest regions were used to develop regional height growth models for Scots pine.
Models and parameters estimation. To develop regional height growth models: nonlinear fixed-effects and nonlinear mixed-effects modelling approaches were used.
A nonlinear mixed-effects modelling approach. The growth series are characterised by a hierarchical structure, which contains information about individual trees and natural forest regions. The mixed-effects modelling approach is a possible solution to handle such data [39][40][41][42] . To develop the height growth model using a mixedeffects modelling approach, we tested three functions well known in forest growth modelling: Chapman-Richards, Schumacher and Hossfeld: where H is the top height (m), t is an age, and a 1 -a 3 are estimated parameters.
Given the value of the mean absolute error (MAE), adjusted coefficient of determination ( R 2 adj ), and the Akaike information criterion (AIC), the possibility of defining different random parameters was achieved, with a maximum of all (three) parameters found in the tested functions on both: trees and natural forest regions as grouping levels. In the case of heteroscedastic data, analysing likelihood ratio test and solutions containing powertype variance function with age as a predictor were applied 43 . Furthermore, given within-tree autocorrelation, the autoregressive correlation structure of order 1-AR(1) 44 and autoregressive-moving average models for the (1) Chapman-Richards: H = a 1 1 − e a 2 t a 3 (2) Schumacher: H = e a 1 +a 2 t a 3 (3) Hossfeld: H = a 1 t a 2 a 3 + t a 2−1 1/3 www.nature.com/scientificreports/ within-group errors was analysed using corAR1 and corARMA(10,0) function from nlme R package respectively 41 with RStudio software 45 .
Development of the models using nonlinear fixed-effect modelling approach with the functions developed using the algebraic difference approach. The algebraic difference approach (ADA) formulated by Bailey and Clutter 46 allows for the derivation of models in which the SI depends only on one parameter. Therefore it allows obtaining polymorphic models with single asymptotes or anamorphic models with multiple asymptotes. The generalised algebraic difference approach (GADA) developed by Cieszewski and Bailey 47 allows the use of multiple site parameters and models characterised by polymorphism and variable asymptotes for different sites. Height growth models should also show equality of the SI and height at base age and use the same function as the height growth and SI models. After initial preselection, three ADA models (M2, M4 and M5) and two GADA models (M1, M3) were chosen for further analysis ( Table 2). These models meet the criteria mentioned above and have been successfully applied in SI modelling, making them potential candidates for reliable SI models.
Local (site-specific) parameters of individual growth series and global parameters of SI models (M1-M5) were simultaneously fitted using the dummy variable approach (DVA), also known as the varying parameter method 53 . In the DVA, global and site-specific parameters are simultaneously estimated, resulting in the same parameter estimates regardless of the selected base age; therefore, it is a base-age invariant method and produces unbiased estimates for base-age invariant equation 54 . The DVA uses a starting value of height as the site-specific parameter for each growth series, assuming all particular growth series measurements have the same starting value. The dynamic model takes the following form Eq. (4).
where H is tree height, α SIi is the site-specific parameter for tree i, whose starting value for fitting is the observed SI value at the specified base age, d i is a dummy variable (being 1 for tree i or 0 for other trees), b is a vector for global parameters, T j is any age, T 0 is an initial condition, and ε ij is the error term. As a result of parameter estimation, an expression containing simulated variables and a site-specific parameter allows the calculation of a unique site parameter for each tree. Due to the longitudinal nature of growth series data, subsequent height observations from the same growth trajectory are significantly correlated, and the assumption of independent errors may be violated. Therefore, the error structured modelling approach with a linear k-order autoregressive error was used in parameter estimation (Eq. 5).
where ε ij is the error observed at the jth measurement on ith tree, φ k -are the k parameters to be estimated for the autoregressive process of order k, r k is a dummy variable equal to 1 when j > k and equal to zero, when j < k, and u ij is random noise.
The quality of the fit was evaluated using MAE, R 2 adj , and AIC. The criteria mentioned above, and the evaluation of the residuals homoscedasticity, which was carried out based on the distribution of residual values against the predicted values, were used to select the function best fitted to empirical data.
The potential improvement of height growth prediction was analysed by expanding the height growth model parameters as a function of regions. For the calculation of the full model, taking into account the specific effect of region, the vector of global ith model parameters (b i ) was expanded by the expansion term (δ iRj ), which is Table 2. The dynamic generalised algebraic difference approach (GADA) and algebraic difference approach (ADA) formulation of growth functions tested in height growth model development. Where H is the tree height, S site index, X and X 0 theoretical variable representing site conditions, T tree age, H 1 is the measured height at age T 1 , H 0 is a site parameter denoting stand height at age T 0 , and a 1 , a 2 , a 3 , a 4 , a 5 , b 1 , b 2 , and b 3 are model parameters.

Model no
Base model forms Parameter related to site Solution for theoretical variable X Dynamic ADA or GADA formulation and reference www.nature.com/scientificreports/ specific for the given jth region (Rj) and ith global parameter (i). Six dummy variables (Rj) representing regions were used to simultaneous fitting the expansion terms specific for each ith parameter (b i ) and jth region. The significance of the expansion terms was tested using t-statistics and was assumed as a statistical test for the difference between a given global parameter b i and the regional parameter values. Regional parameter values are the sum of the global parameters b i and expansion term (δ iRj ). Moreover, to compare the fits of the full (regional) and reduced (global) model, we used the anova() function with the regression objects as two separate arguments 41 . The anova() function takes model objects as arguments and returns an ANOVA to check that a more complex model is much better at capturing data than a simpler model.
The full model trajectories were directly compared by plotting the differences between observed and model trajectories. Parameter estimation was carried out in R using the procedures and suitably defined model form 45 .

Results
Nonlinear mixed-effects models. The range (0.359-0.710, Table 3) of the mean absolute error indicates that the application for the mixed-effects modelling approach when elaborating the height growth models allows obtaining high accuracy. The obtained results indicate that a better fit characterises the functions with defined two random parameters than one random parameter. The Chapman-Richards function with parameters a and b, which refer respectively to the asymptote and the growth rate, defined as random effects, was characterised by the best fit (Table 3). Note the shape of the function is expressed as a parameter c is estimated as a fixed-effect.
Moreover, analysis of the residuals for the best fitted Chapman-Richards model (estimated parameters shown in Table 4) indicated heteroscedasticity (Fig. 4). The assumed variance function did not model heteroscedasticity well. Likewise, the likelihood test indicated that the variance function has a nonsignificant influence on model fit. Alike autoregressive correlation structure of order 1-AR(1) autoregressive-moving average models for the within-group errors ARMA(10,0) has significant influence on model fit (p < 0.0001 for likelihood test). Whereas ARMA(10,0) solution allows to reach the best goodness-of-fit measures (Table 4).
Nonlinear fixed-effects models. All preselected functions showed a good fit to the data and explained (in most cases) close to 99% (adjusted R 2 ) of the height growth variation ( Table 5). The error structured modelling with the 10-order autoregressive error was used in parameter estimation. All developed models prediction errors were relatively low, with root mean absolute error (MAE) varying from 0.59 to 0.69 m.
Considering the largest part of the explained variance, the smallest average absolute error and the AIC value, the function M1 was selected as best fitted to the data. Next function M1 was used to develop the model taking into account the regional variability. For fitting of the full model, taking into account the specific effect of region, the parameters of the model M1 were expanded as a function of regions (Eq. 6). Table 3. Adjusted coefficient of determination ( R 2 adj ), the Akaike information criterion (AIC) and mean absolute error (MAE) for Chapman-Richards function with different random parameters on both: trees and natural forest regions as grouping levels (random effects are expressed with index i).  www.nature.com/scientificreports/ The extended full (regional) model explains 0.47% more residual variance, while all expansion terms are statistically significant (t < 0.001, Table 6). Besides, the ANOVA result indicates that more complex, the regional model is significantly better (p < 0.01) than the reduced (global) model and thus favour the regional model.
The residuals of the developed regional growth model are homoscedastic in the entire range of values predicted for individual trees and in individual regions (Fig. 2).  Table 6. Parameters of the regional model developed using a nonlinear fixed-effects modelling approach. Expansion terms (δ iRj ) expressing the difference in the values of ith parameters (b i , i = 1, 2, 3) of the Eq. (5) in jth regions (Rj, j = 1, 2, … 6). www.nature.com/scientificreports/ The developed regional model does not cause systematic errors in SI classes, age classes and tree height classes (Fig. 3). The mean residuals slightly deviate from 0 only in the least common SI below 15 m, and the equally rare age class above 120 years and the height class above 35 m (Fig. 3). www.nature.com/scientificreports/ Comparison of regional models developed with the use of nonlinear mixed-effects and nonlinear fixed-effects modelling approach. Based on the model fit statistics and graphical error analysis, it can be concluded that the regional model developed using the fixed-effects approach is characterised by better fit statistics (Table 7) and the distribution of errors (Fig. 4). Analysis of regional variability of height growth patterns based on a model developed using the fixed-effects modelling approach. The height growth patterns in individual regions show that it www.nature.com/scientificreports/ changes directionally from the north and northwest to south and southeast Poland (Fig. 5). In the NFR located in the northwestern (I, III) and northern (II) lowlands of Poland, the height increment rate is also maintained in older stands. In the case of NFR located in the highlands of southern and southeastern Poland, the height increment in old age is inhibited, and therefore the growth curves are more asymptotic (Fig. 5). Growth trajectories of models developed for individual regions were directly compared using the height at base age 40 as the reference level (Fig. 6). A direct comparison of the height growth trajectories for different SI values determined for the base age of 40 years shows a significant diversification of the height growth patterns between regions, especially in more fertile sites. Up to the base age of 40, the height growth in individual regions is very similar. However, above the base age, a considerable diversification of the growth pattern is observed, especially for most fertile sites with SI equal to 23 or 17 m. Intensive height increment persists for the longest in older stands in the lowlands of northwest and northern Poland, especially in NFR I and III, which are under the influence of the Atlantic climate. Scots pine NFR II located in northeastern Poland grows only slightly more slowly. In the stands growing in regions VI and IV located in the highlands of southeastern Poland, where the climate is continental, the growth is slowed down in older stands. Scots pines from the NRF V located in southern Poland lowlands show weaker growth in old age in the most fertile sites, but in the poorer, their growth dynamics persist until old age (Fig. 6).

Discussion
We compared the development of regional height growth models using nonlinear fixed-effects and nonlinear mixed-effects modelling approaches. Our results indicate a better fit to the data of the model developed using the nonlinear fixed-effects modelling approach. Comparison of fixed-effects and mixed-effects models was conducted by Nigh 40 who also found a better fit of models built with GADA. Similar results were also obtained by Cieszewski 55 , who demonstrated that self-referencing GADA models based on the nonlinear fixed-effects approach possess all the desirable properties associated with logical behaviour of the model and estimation statistics, while the nonlinear mixed-effects models lack the well-behaved model properties. The reason for the better fit of the models may be that the nonlinear fixed-effect models ar by definition more flexible than the nonlinear mixed-effect models 55 . Moreover, models can significantly overestimate variances based on untenable assumptions and have more restrictions imposed regarding the site effects' distributional properties. Cieszewski 55 points out that this is because the NME model fitting process imposes an additional constraint in that the random effect must conform to the arbitrary assumption of randomness, meaning that it must always have a greater or equal sum of square errors when the same models are fit into the same dataset as site models. Our results also confirm well-behaved properties of the nonlinear fixed-effects self-referencing models and their better suitability for modelling TH growth and SI compared to mixed-effects models.
The developed models allowed us to compare the differences in Scots pine height growth dynamics between natural forest regions in Poland. We found that height growth is regionally differentiated. The most relevant Table 7. Characterisation of the fit statistics of the regional models developed using mixed-effects and fixedeffects modelling approaches. www.nature.com/scientificreports/ differences in growth patterns are observed from the northwest to the southeast gradient. The shape of the growth trajectory for stands located in natural forest regions from the north and northwestern Poland (I-III) differs for younger and older trees. In these natural forest regions, the stands also reach higher absolute heights than those of southern regions. Our results show that the use of the regional model Poland reduces errors in the height growth prediction. Regional variation in growth patterns was previously reported; these variations have usually been interpreted as the result of variability in climate, geology, soil type, geographical location, and genotype [13][14][15] .
We found significant differences between the height growth patterns from NRF located in lowlands of northern-northwestern and northeastern Poland and highlands of southern-southeastern Poland. This may be because natural forest regions in northern Poland are under the influence of Atlantic climate and have deep podsolic, brown podsolic, and brown glacial soils, whereas natural forest regions VI and IV are under the influence of continental climate and grow on upland soils with less depth. The depth and physical properties of soil and the sand and clay contents lead to varying soil fertility and productivity 56,57 . Additionally, the availability of water is strongly correlated with the growth of stands and impacts the productivity of sites 58 . A linear increase in net primary production with increased water availability was observed at dry sites 59 . The amount of annual www.nature.com/scientificreports/ precipitation in Poland changes from south to north and, with soil fertility and depth, can explain the difference in height growth variability between the northern and southern regions. We found that growth trajectories for individual regions vary with age. The growth of stands in southern and southeastern Poland is slowing rapidly, whereas northern and northeastern stands have more continuous rapid height growth. The difference in growth pattern in regions IV, V, and VI observed in older age can be explained as the impact of soil depth. When soil depth is not a limiting factor in the youngest age classes, the impacts of climate and other factors are more important for growth dynamics, but with the transition to older age classes and increased interspecific competition, soil depth likely begins to limit the growth dynamic. An evident slowdown of growth dynamics in oldest age classes in natural forest regions IV and VI may be the result of a combination of shallower and less fertile podsolic and brown podsolic soils and low annual rainfall. In regions, I, II, and III, geological and climatic conditions are more favourable for Scots pines and likely result in a larger height increment in the older stands.
The significant height growth differences in different regions indicate that the use of one generalised height growth model for the entire country may result in the inappropriate estimation of site productivity and have direct forest management consequences. The importance of regional models may increase because the projections of climate change in Poland predict accentuated differences in site conditions between the northwest and southeast regions 60 . The obtained results concerning the spatial variability of Scots pines' height growth dynamics should be considered when making forest management decisions concerning the rotation age or intensity of silvicultural treatments.
The amount of carbon stored in forests is more significant than in any other terrestrial ecosystem 61 , and tree growth data play a significant role in carbon sequestration and biomass production calculations 62 . Several conversion methods have been developed for different types of forest to obtain reliable estimates [63][64][65] . However, our results show that not considering specific regional growth conditions and the resulting differentiated growth patterns by using only global models for these calculations may result in an over-or under-estimation of regional height growth, biomass production, and carbon sequestration. Many studies on the estimation of biomass production and carbon sequestration emphasise that the regional approach is promising 66,67 ; therefore, the use of models developed on a regional scale shows particular importance for effective forest management.
With the development of geographical information systems and technologies, tools such as the mapping SI are increasingly common 68 . Many problems regarding the stability of forest stands and disturbance of forests in Europe concern relatively small areas. Numerous studies found that SI mapping and analyses, (particularly on a small scale) provide information for a more dynamic forest growth assessment and should be an essential component of regional planning and management 69,70 . Therefore the use of advanced forestry solutions focused on smaller areas, regional growth models, and the analysis of local trends have great potential for the fast detection of problems and the precise formulation of answers to stand instability and forest disturbances.
We used the height growth trajectories of dominant Scots pines collected using the unified methodology to develop a regional model; however, collection methods and data processing in different regions should be carefully executed. Differences in methodology or data sources can lead to significant inconsistencies between growth models that are not based on ecological or biological causes 18 . A given species growth patterns in different ecological or geographical conditions are significant and should be determined before developing local models for small areas. The differences in local model predictions may not be large enough in small geographical regions to justify the development costs 18 . In this situation, smaller regions with low climatic and geomorphological variability should be grouped to develop a single model for larger areas.

Conclusions
We compared the development of regional height growth models using nonlinear-fixed-effects and nonlinearmixed-effects modelling approaches. Our results indicate a slightly better fit to the data of the model using the nonlinear fixed-effects approach. The developed models show differences in height growth patterns of Scots www.nature.com/scientificreports/ pines in Poland and revealed that acknowledgement of region as the independent variable could improve the growth prediction and quality of the SI estimation. The presented study showed regional differences in height growth patterns of Scots pines in Poland. Differences in climate and soil conditions that distinguish natural forest regions affect Scots pine height growth patterns. Therefore, extending this research to models that directly describe height growth interactions with site variables, such as climate, soil properties, and topography, can provide valuable forest management information.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.