Causes of variations of trace and rare earth elements concentration in lakes bottom sediments in the Bory Tucholskie National Park, Poland

The objective of this study was to analyse spatial variability of the trace elements (TEs) and rare earth elements (REEs) concentration in lake bottom sediments in Bory Tucholskie National Park (BTNP); Poland. The following research questions were posed: which factors have a fundamental impact on the concentration and spatial variability of elements in bottom sediments, which of the elements can be considered as indicators of natural processes and which are related to anthropogenic sources. The research material was sediments samples collected from 19 lakes. The concentrations of 24 TEs and 14 REEs were determined. The analyses were carried out using the inductively coupled plasma mass spectrometry (ICP-QQQ). Cluster analysis and principal component analysis were used to determine the spatial variability of the TEs and REEs concentrations, indicate the elements that are the indicators of natural processes and identify potential anthropogenic sources of pollution. The geochemical background value (GBV) calculations were made using 13 different statistical methods. However, the contamination of bottom sediments was evaluated by means of the index of geo-accumulation, the enrichment factor, the pollution load index, and the metal pollution index. The BTNP area is unique because of its isolation from the inflow of pollutants from anthropogenic sources and a very stable land use structure over the last 200 years. This study shows high variability of TE and REE concentrations in lake sediments. The values of geochemical indices suggest low pollution of lakes bottom sediments. It was found that TEs originated mainly from geogenic sources. However, the concentrations of Li, Ni, Sc, Se, Be, Se, Ag, Re, Tl, Cd, Sb and U may be related to the impact of point sources found mainly in the Ostrowite Lake. Almost all REEs concentrations were strongly correlated and their presence was linked to with geochemical processes. The elements allowing to identify natural processes and anthropogenic pollution sources were Cr, Co, Cu, Ag, Cd, Zn, Bi, Re, Ba, Al and Rb in TEs group and Nd, Gd, Yb, Lu, Eu, Dy and Ce in REEs group. The analysis shows high spatial variability of TE and REE concentrations in lake sediments. The values of geochemical indices point to low pollution of lakes sediments. The anthropogenic sources only for two lakes had an impact on concentrations of selected TEs and REEs. The analyses allowed to identify elements among TEs and REEs documenting geochemical processes and those indicating anthropogenic sources of pollution.

Scientific Reports | (2021) 11:244 | https://doi.org/10.1038/s41598-020-80137-z www.nature.com/scientificreports/ brown coal. This may be associated with a high supply of pollutants to the atmosphere, which may subsequently be deposited in surface waters in the form of dry or wet deposits. Lakes located in the BTNP are specific in this context. The BTNP is located in the central part of the Brda River within the area drained by the Struga Siedmiu Jezior River. The catchment area of the Struga Siedmiu Jezior covers 85.5% of the area of the BTNP and can therefore be considered unique. This is due to the total limitation of the inflow of pollutants to the lakes from anthropogenic sources via roads and a very stable land use structure (mainly forest) over the last 200 years 40 .
The aims of the study were to: (1) analyse TE and REE concentrations in bottom sediments of BTNP lakes, (2) determine GBV values for TEs and REEs, (3) assess bottom sediment pollution by TEs and REEs, (4) analyse the sources of individual elements in bottom sediments, (5) identify elements that may be indicators of natural and anthropogenic processes. Chemical analyses of bottom sediments in the area of the BTNP in such a wide range of lakes and parameters (TEs and REEs) are performed for the first time, and therefore this study may be a reference point for future geochemical surveys.

Material and methods
Study site. The Bory Tucholskie National Park (BTNP) was established in 1996. The BTNP is located in the central part of the Brda catchment area within the Struga Siedmiu Jezior catchment. The Struga Siedmiu Jezior stream flows into Charzykowskie Lake, which is located on the Brda River, the left tributary of the Vistula River, the largest river in Poland. The area of the BTNP is 46.13 km 2 . The most valuable parts of the BTNP are 24 lakes with a total area of 5.37 km 2 . The Ostrowite, Zielone, Jeleń, Bełczak, Główka, Płęsno, Skrzynka and Mielnica Lakes are located in the Struga Siedmiu Jezior stream (Fig. 1). The lakes Krzywce Wielkie, Błotko, Głuche and Krzywce Małe are connected by an artificial channel which periodically leads the water to the lake Skrzynka. Other lakes do not have direct contact with rivers. The total length of the BTNP watercourses is 12.4 km, which gives in relation to the catchment area a river network density of 0.27 km km −2 .
The study included 19 lakes (Table 1). Ostrowite Lake is the deepest lake in the BTNP with a maximum depth of 43 m, whereas Mielnica Lake with a maximum depth of 1.2 m is the shallowest. The lakes' surface area varies from 0.3 to 280.7 × 10 4 m 2 and volume from 5.5 × 10 3 to 29,989.8 × 10 3 m 3 . The ratio of the total area of the lake drainage basin and the area of the lake is from 6.2 to 544.2 for Ostrowite and Bełczak lakes, respectively. The lake coefficient indicates the potential amount of matter supply from the catchment area (Table 1). The BTNP includes oligotrophic, mesotrophic, eutrophic and dystrophic lakes 41,42 . According to the hydrological typology, they are flow-through, exorheic and endorheic lakes, and according to the thermal typology, the lakes are stratified, partially stratified and not stratified. The unique feature of the BTNP is the occurrence of five lobelian Sample collection and preparation. Samples of bottom sediments for chemical analyses were collected on 7 and 8 August 2017 from 19 lakes located in the BTNP area (Fig. 1). Samples were taken at the deepest point of each lake and in the case of Ostrowite Lake two samples were taken at the deepest points of the two western and eastern parts. Bottom sediment samples of 10 cm in depth were collected by a Nurek-1 sampler (Mera-Błonie, Gdańsk, Poland) into polyethylene (PE) containers. The sample numbers are presented in Table 1. The samples after transport to the laboratory were divided into two parts. The first part of the sediment was dried at Table 1. Basic parameters of lakes in the BTNP. A outwash plain sands and gravels; B aeolian sands; C peats; D aggradate muds; E silts, limnic sands and chalks; F fluvial sands and gravels; G boulder loams. a Woronko and Nowicka 43  Determination of GBV-comparison of different approaches. GBV calculations were made using 13 different statistical methods ( Table 2). The first method used is one of the most frequently described in the literature, which uses the mean value (Me) and standard deviation (SD)-M2SD (Me + 2SD). With respect to this method there are also many critical comments, which result from the assumption of a priori normality of the distribution of concentrations of analysed elements. In this study it was found that some of the TEs were normally (Al, V, Cr, Co, As, Ba, Re, Pb, Bi and Th) or log-normally distributed (Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U). However, the REEs were log-normally distributed. For this reason, an attempt was made to modify the previously used method. For this reason the data were therefore log-transformed (log10(x)). On the basis of log-transformed data, the values of the Me + 2SD were calculated. Then the results were back-transformed-LTM2SD. Taking into account the other critical comments concerning the calculation of GBV by the M2SD method, it was decided to remove outliers identified by the Grubbs-Beck test. Then, on the basis of data excluding outliers, GBV calculations were made using a standard procedure based on Me and SD (ORM2SD) and in combination with logarithmic transformation (ORLTM2SD). Another way that is recommended when calculating GBV because of the occurrence of distributions other than normal and outlier observations is to use formulas based on the median value (Md). Therefore, the M2MAD method based on the median value (Md) and median absolute deviation (MAD) was used to determine GBV. Reimann et al. 21 and Salomão et al. 22 recommended applying LTM2MAD method based on the median value (Md) and median absolute deviation (MAD) with log-transformed dataset. In addition, the Tukey Boxplot method (Tukey inner fence, TIF) was used, which is based on the following values: 75th percentile (Q 3 ) and inter quartile range (IQR). The IQR is calculated as the difference between the 75th percentile and 25th percentile (Q 3 -Q 1 ) multiplied by the value 1.5 47 . Reimann et al. 21 recommended also to use LNTIF method, which is analogous to the TIF method, however, a log-transformed dataset are used. Moreover, for each TE and REE a cumulative distribution function (CDF) was developed, from which the values of the 90th, 95th and 98th percentile were obtained. These values were considered to be GBVs. The CDF was also used to find inflexion points. The lower inflexion point represents the upper limit of natural origin concentrations, and the higher inflexion point represents the lower limit of abnormal concentrations 51 . The inflexion points in a CDF were determined by examining the CDF vs. TEs and/log-TEs and log-REEs data under the linear regression model for P < 0.05 and R 2 > 0.95. The data of Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U and REEs were therefore log-transformed prior to the CDF plotting. The TEs and REEs concentrations were plotted on the x-axis and the cumulative frequency was plotted on the y-axis. The maximum value was removed to obtain linearity of the data; this procedure was repeated until the above two criteria were met, and the maximum value within the regression line will represent the inflexion 51 . Finally the first point deviating from the regression line was chosen as the upper inflexion point, which was considered to be the baseline value for the selected TEs and REEs. In the last NLRR method was carried out to find the linear relationship between the contents of each TE and REE elements and reference elements 53,54 . Reference elements proposed by Wei and Wen 51 included Al, Fe, La, Li, Sc, Ti and V. In order to select the best reference elements, correlation analysis was carried out, and showed that for TEs the reference element is V and for REEs the reference element is La. The choice of these elements resulted from the degree of correlation between the individual TEs or REEs and the reference elements.

Assessment of bottom sediment contamination. The assessment of bottom sediment pollution by
TEs and REEs was performed on the basis of four commonly used indices: the index of geo-accumulation (I geo ), the enrichment factor (EF), the pollution load index (PLI), and the metal pollution index (MPI). I geo and EF are single indices, while PLI and MPI are integrated indices. For I geo , EF and PLI the concentrations of TEs and REEs in bottom sediments are related to the GBV. Formulas for calculating I geo , EF and PLI are widely used and can be found in the publications 57-59 . The limit values for the assessment of bottom sediment contamination have been assumed to be Martin and Meybeck 58 , Harikumar et al. 60 and El-Sayed et al. 61 .
The only index that does not use GBV is MPI. The MPI was used to compare the total TE and REE contents between different lakes. The MPI was calculated as follows: where C n are concentrations of TEs and REEs in the sample and i is the number of analyzed metals 62,63 . The GBV for TEs and REEs was assumed on the basis of our own research results. CDFIP methodology based on CDF and inflexion point was assumed as a reference. The calculated GBVs for lake bottom sediments in BTNP are presented in Table 4. In order to present the results of the I geo , EF, PLI and MPI indices, a Python language program (GeoVIS) was developed, which allows the automatic creation of heat maps.
Calculation of REE anomalies. The analysis of REE concentrations in bottom sediments started with the development of standardized graphs with respect to concentrations in chondrite 64 . The Eu, Gd and Yb anomalies were preliminarily identified on standardised graphs against REE concentrations in chondrite. Then the analysis of the REE concentration anomalies was performed. Anomalies in the REE concentration enable one to identify potential sources of pollution and the influence of natural processes on their concentration 32 . In order to calculate the Eu, Gd and Yb anomaly, the equations proposed by Benabdelkader et al. 65 were used. In the last stage the REE values were calculated for three groups: light (LREE-sum of La, Ce, Pr and Nd concentrations),

Statistical analysis.
The collected data were subjected to multivariate statistical analyses. The multivariate analysis was carried out on data that were transformed using centred log-ratio (clr). The clr transformation destroys the closure effect and 'opens' or 'un-constrains' the data to reveal inherent patterns in the data structure that can provide straight forward interpretation [66][67][68] . Cluster analysis (CA) was conducted to divide the lakes to the groups characterized by the highest similarity in terms of TE and REE concentrations in bottom sediments. CA analysis was performed using the Ward method and the Euclidean distance square as a measure of similarity. A dendrogram was used to visualize the results of the CA analysis. The results of clustering lakes on the basis of TE and REE concentrations in lake bottom sediments was presented using the standardized linkage distance (D link /D max ). The mean and median values were calculated for the groups and subgroups. To assess the statistical differences among groups of lakes taking into account concentrations of TEs and REEs in bottom sediments a non-parametric test of the variance (Kruskal-Wallis, K-W) and Dunn's tests as post hoc procedures (P ≤ 0.05) were performed.
In order to select the next method of statistical analysis, detrended component analysis (DCA) was carried out. DCA was conducted to assess the environmental factors' distribution pattern. The DCA indicated that the TE and REE concentrations and environmental factors were in linear distribution. Ter Braak and Smilauer 69 suggested that in such cases principal component analysis (PCA) is more appropriate for future analysis. The number of significant principal components (PC) was selected on the basis of a Kaiser criterion-eigenvalues higher than 1. The correlation between PC and TEs and REES was classified according 70

Results
In the first stage of the analysis, the values of minimum, mean, median, maximum and standard deviations were calculated. In addition, CDF were developed from which the values of percentiles 1, 5, 10, 25, 50, 75, 90, 95 and 99 were obtained (Table 3). Based on the mean values, the TE concentrations were ordered in the following ascending order: Tl < Se < Th < Ag < Sc < Bi < Be < Re < Sb < U < Co < Cd < Rb < Mo < Cr < As < Cu < Ni < Li < V < Pb < Ba < Z < Al. Among the analysed TEs the highest variability was observed in the concentrations of Se, Ag, Cd, Sb and U. The lowest variability was observed for Al, Ba and Th. On the other hand, the concentrations of REEs in the bottom sediments of lakes were in the following ascending order Lu < Tm < Ho < Tb < Yb < Yb < E u < Er < Dy < Sm < Pr < Gd < La < Nd < Ce. In the case of REEs the variability was lower than in TEs. The highest variability was observed for Er, Tm and Lu and the lowest for La, Pr, Nd, Ho and Yb.
The Grubbs-Beck test showed that there were 21 outlier observations in the TE dataset (Table 3). Outlier values were found in the Li, Be, Sc, Ni (two values), Cu (two values), Zn, As, Se (two values), Mo (two values), Ag (three values), Cd, Sb, Tl and U (two values). Ten outlier values were recorded in samples of bottom sediments from the western part of Ostrowite Lake (sampling point no. 1-Li, Be, Sc, Ni, Se, Ag, Cd, Sb, Tl and U), four outlier values in the eastern part of Ostrowite Lake (sampling point no. 2-Ni, Se, Ag and U), two outlier values in Gacno Wielkie Lake (Cu and Ag) and one outlier value in Krzywce Wielkie Lake (Cu), Zielone Lake (Mo), Gacno Małe Lake (Mo), Krzywce Małe Lake (As) and Głuche Lake (Zn) ( Table 3). REE analysis revealed 19 outliers in the data set. Among 14 REEs in 13 cases outliers were observed. Only in the case of Yb were no outliers observed. Most outlier observations were recorded in the sediment samples from the western part of Ostrowite Lake (no. 1-La, Ce, Pr, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Lu). Moreover, six outlier values (Eu, Gd, Tb, Ho, Er and Lu) were found in the eastern part of Ostrowite Lake (no. 2). Among the analysed TEs the following elements had normal distributions: Al, V, Cr, Co, As, Ba, Re, Pb, Bi and Th. The other TEs elements-Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U-were log-normally distributed. The distributions were right-skewed; mean values (Me) were higher than median values (Md). All REEs were log-normally distributed.
Statistical analysis showed that in the TE and REE data set there are elements that are characterized by different variability and distributions. Moreover, there are also outliers among them. Taking this into account, it was decided to test different methods of GBV calculation in order to illustrate the uncertainty of such analyses. Moreover, it should be noted that the analysed lakes are located in a small area, which is separated from anthropogenic sources, including transgenic ones. The GBV results obtained by different methods are shown in Table 4. The number of values exceeding the set GBV values is given in brackets.
The descriptions of methods are presented in Table 2. The EGA values were obtained from the European Geochemical Atlas developed by Salminen et al. 56 for the BTNP location. In order to compare the results obtained with different methods, they were standardized in relation to the maximum value (Fig. 2).
The results showed that the differences between the minimum and maximum GBV values were for U 92.5% (the highest value) and for Th 43.0% (the lowest value). Average differences between the minimum and maximum GBV values within TEs and REEs were about 71%. The highest GBV values were obtained using LTTIF  In the case of REEs outlier values were found only in the western and eastern part of Ostrowite Lake 13 and 10 times, respectively. Taking into account individual elements, the greatest number of outliers was found in Sc, Co and Zn (five times), Li and Cr (four times) and V, Cu, Ag and Cd (three times). However, in the case of Al and Rb, outliers were not detected. Among REEs there were usually two outliers (ten times). Only in the case of Yb no outliers were observed. The standard approach to analysis using M2SD, M2MAD, LTM2MAD, TIF and LTTIF methods does not allow one to detect values that could potentially be associated with anthropogenic sources. Detailed statistical analysis allows one, on the one hand, to remove outliers (ORM2SD), transform variables to a normal distribution (LTM2SD) or perform the above operations together (ORLTM2SD), but still these methods do not allow one to detect suspicious observations. In the case of the NLRR method, which was to normalize TEs against V and REEs against La, the observations outside the 95% confidence interval were removed. These observations are considered to be atypical.  (14) Ag (

Rare earth elements-REEs
La (1) (2) (4). The results show that it is a method that does not always exclude outliers from the data set. These are rather atypical values resulting from the different geological structure of geochemical processes rather than from the influence of anthropogenic sources. In addition, calculations have shown that there is no relationship between V and Mo and Ba, and La and Yb. In this respect, another element for standardisation should be identified, which may have an impact on the homogeneity of GBV results. The results show that the calculation of the GBV value requires an individual approach for each element. Taking into account the above results, it was considered that the choice of the CDFIP method is the most appropriate one. On the one hand, lakes where bottom sediments may have been contaminated have been identified. On the other hand, data sets within individual elements have been reduced by up to five values associated with anthropogenic sources. GBV results calculated with the CDFIP method were used to determine the contamination of sediments by means of geochemical indices I geo , EF and PLI. Results obtained with the I geo index indicate that in most lakes the bottom sediments were unpolluted (Fig. 3a). In lakes Jeleń (no. 3), Olbrachta (no. 16), Kocioł (no. 17), Kociołek (no. 18) and Kacze Oko (no. 19) and Rybie Oko (no. 20) Igeo values for all elements were lower than 0, which indicates that they were not contaminated. For most of the other lakes, single values of the I geo index were higher (from 0 to 1), with the exceptions of Ostrowite (no. 1 and 2) and Krzywce Wielkie Lake (no. 5). In the eastern part of Ostrowite Lake (2) and Lake Krzywce Wielkie (2) the values of the I geo index from 0 to 1 were 13 times and nine times, respectively. The most polluted were the bottom sediments of the western part of Ostrowite Lake (no. 1), 19 values of the I geo index indicated their poor pollution (0 < I geo ≤ 1), seven values moderate pollution (1 < I geo ≤ 2) and in one case more than moderate pollution (I geo > 2). Taking into account particular elements, moderate pollution was found for Ag, Cd, U, Tb, Er and Lu. However, in the case of Se, pollution exceeded the level of moderate pollution one time. The results of the CF index in most cases corresponded to the results of the I geo index (Fig. 3b). However, the use of the CF index indicates slightly higher contamination of bottom sediments. The results showed that the most polluted were the sediments of the western part of Ostrowite Lake (1), and among 38 analysed elements as many as in 22 cases CF values indicated moderate pollution (1 < CF ≤ 3). Moreover, in seven cases the pollution was considerable (3 < CF ≤ 6) and in one case there was very high contamination (CF > 6). In the case of other lakes the CF index values were lower than considerably polluted and very highly polluted. For most lakes there were single Taking into account individual elements, the values of CF index above 1 were most often noted in the cases of Li and Cr (five times) and Sc, Co and Zn (six times). In the case of Ag, Cd, Tl, U, Tb, Er and Lu, one value each was found to show considerable pollution and in the case of Se one value was found to show very highly pollution. PLI index analysis showed that for TEs values ranged from 0.17 to 1.45 and for REEs from 0.09 to 1.96. PLI values above 1, which indicates sediment contamination for TEs, were found for the western part of Ostrowite Lake (no. 1) and Krzywce Wielkie Lake (no. 5), whereas for REEs values for both parts of Ostrowite Lake (nos. 1 and 2) were found (Fig. 3c). The MPI values ranged from 0.23 to 1.95 and from 0 to 0.413 for TEs and REEs respectively. In the case of TEs, the most polluted were bottom sediments of the western part of Ostrowite Lake (no. 1) and Krzywce Wielkie Lake (no. 5). Moreover, MPI values higher than 1 were recorded for the eastern part of Ostrowite Lake (2), Krzywce Małe Lake (10) and Głuche Lake (no. 15) (Fig. 3d). In the case of REEs, both parts of Ostrowite Lake (nos. 1 and 2) were the most polluted. The analysis of REEs concentration in bottom sediments of the BTNP lakes was started with the development of standardized graphs with respect to their concentrations in chondrite (Fig. 4). In terms of REEs the lakes were divided into three groups: (1) Ostrowite Lake-western and eastern part (Fig. 4a), (2) Kacze Oko, Rybie Oko, Kocioł and Kociołek lakes (Fig. 4b) and (3) other lakes (Fig. 4c,d) excluding Mielnica Lake. In the case of this reservoir, the standardized REE graph differed from all other lakes. Visual analysis of the graphs showed the occurrence of a negative Eu anomaly and a positive Gd anomaly in most lakes. Moreover, in Ostrowite Lake and additionally in Kocioł, Kacze Oko and Rybie Oko a negative anomaly of Yb and Tm was observed respectively. The formulas presented by Benabdelkader et al. 65 were used to confirm the above observations. It was assumed that the computations of Eu, Gd, Yb and Tm anomalies will have values above > 1.1 or below < 0.9, which indicates the occurrence of positive and negative anomalies, respectively. A value of 1 means that the element was not fractionated relative to the crustal composition 62 . The calculations confirmed the occurrence of negative Eu anomalies, except for Mielnica Lake. The values of Eu anomalies ranged from 0.30 to 0.84. A similar negative Eu anomaly was found in the bottom sediments of Lake Wadąg 32 . The lowest values were recorded for Kocioł and   (Fig. 5). This is confirmed by earlier studies by Sojka et al. 32 , obtained for Lake Wadąg located in north-eastern Poland. Benabdelkader et al. 65 demonstrated that reservoirs play an important role in REE retention (LREE) and to delay the release of MREE and HREE downstream. There was slightly more MREE in the individual lakes than HREE. Only in the case of Kociołek, Kacze Oko and Rybie Oko lakes were the MREE and HREE contents at a similar level. The (La/Yb) N ratio normalized in relation to chondrite in the bottom sediments of the BTNP lakes ranged from 7.7 to 160.4. The lowest (La/Yb) N ratio was observed in Lake Olbrachta (7.7) and the highest in the western and eastern parts of Ostrowite Lake (160.4 and 103.7, respectively). In the other lakes the values (La/Yb) N ranged from 9.6 to 23.4. The obtained values were generally lower than that obtained by Sultan and Shazili 71 which was about 20.6.
The CA analysis for TEs enabled the division of lakes into two main groups: TE-1 and TE-2 (Fig. 6a). Within each of the groups, two subgroups have been distinguished. The TE-1-1 subgroup includes western and eastern part of Ostrowite Lake with the highest concentrations of Li, Be, Sc, V, Cr, Co, Ni, Se, Ag, Cd, Sb, Re, Tl, Bi, Th and U. The lowest TE concentrations were found in the TE-2-2 subgroup, which included four lakes: Gacno Małe, Kocioł, Kociołek and Rybie Oko. The TE-1-2 subgroup includes seven lakes, connected by the Struga Siedmiu Jezior. The concentrations in the TE-1-2 subgroup were slightly lower than in the TE-2-1 subgroup, except for Li, Mo, U and Sc. It should be noted that the highest differences in TE concentration were found in the TE-2 group, which included 11 lakes. However, this is due to the lack of direct connection between the lakes and the various geological structure. In the case of REE concentrations in bottom sediments, lakes were divided into two groups: REE-1 and REE-2 (Fig. 6b). REE-1 includes both parts of Ostrowite Lake, in which REEs concentrations were at the highest level. However, within the REE-2 group REEs concentrations were lower, but there was a PCA analysis in the case of TEs allowed us to distinguish six significant components, whose eigenvalues were higher than 1. The components explain 83.10% of the internal data structure (Table 5). With the first PC1 component, Zn, Ba and Th concentrations were strongly negatively correlated and U concentrations were strongly positive correlated. Moreover, with PC1 the concentrations of Al, As and Rb and the concentration of Li, Cd, Re and Tl were respectively moderate negatively and positively correlated. A correlation with PC2 was noted with eight elements. For Pb and Bi strong negative correlation, Cd and Sb moderate negative correlation and for Be,  The PCA analysis of REEs allowed us to distinguish four main components which explain 83.08% of the data structure (Table 5). There were nine elements correlated with PC1 and six elements correlated with PC2. Whereas Eu, Ho and Sm were correlated with PC3 and PC4. The projection of the elements against the background of the four components is shown in Fig. 7c,d. Analysis of REEs against the background of PC1 and PC2 components allows for their division into LREEs and MREEs and HREEs. The exception is Yb and Eu and Ho. For REEs, their distribution in bottom sediments is mainly related to natural geochemical processes, which is confirmed by the concentrations and strong inter-group correlations. The only exception is Lake Ostrowite, where there were outlier values, which indicate their anthropogenic origin. The concentrations of Yb in bottom sediments show considerably different spatial variation from other REEs. There were no outliers within the Yb concentrations. However, both negative and positive anomalies of Yb concentrations were recorded in lakes. In the case of Eu, almost all lakes negative anomalies were observed.
In order to expose the elements best describing the state of contamination of bottom sediments, multiple regression analysis was carried out. Relationships between individual TEs or REEs (independent variables) and values of PLI and MPI indices were investigated. For TEs the calculations showed that for both PLI and MPI the best model with 12 elements-Cr, Co, Cu, Ag, Cd, Zn, Bi, Re, Ba, Al and Rb-was used. The models were statistically significant at 0.05 and the determination coefficient was 0.99. In the case of REEs the results showed that the model for PLI was based on seven elements (Nd, Gd, Yb, Lu, Eu, Dy and Ce) and in the case of MPI only one element (Dy). The determination coefficients for the models for PLI and MPI were 0.76 and 0.99, respectively.

Discussion
The studies indicate that among TEs and REE there were elements that were characterized by various variability, types of distributions and occurrence of outliers. The obtained results are somewhat unexpected as the studied lakes are located in a small area and were formed as a result of glacial processes of the last Central European glaciation. The surveyed area is located completely in the Struga Siedmiu Jezior catchment, which eliminates the inflow of pollutants from the external area. Moreover, the BTNP area is characterised by relatively stable forest land use in the last 120 years. Bottom sediment samples have also been taken from the deepest locations within each lake, away from the shore line. Most often higher concentrations are recorded in the littoral zone than in the profundal zone 32,72 . The studies presented in this paper should be considered as very valuable and may serve as a reference point for further geochemical studies in this area. Similar conclusions were drawn from the studies (Raut et al. 2017) 25 for shallow to medium lakes in high-altitude mountains, which may be important archives for studying various types of pollutants. An essential stage of geochemical analysis is the determination of the GBV value, which is used to assess the contamination of bottom sediments. Gałuszka and Migaszewski 14 indicated that only reliable GBVs enable one to establish objective environmental quality criteria. First of all, it is important to use GBV values calculated for a given region to obtain more reliable results 19,50,73 . Secondly, the choice of the method of estimating the GBV value itself is important. In this paper, 13 methods of GBV calculation www.nature.com/scientificreports/ were tested. The analysis showed that the uncertainty related to the choice of method for GBV determination, expressed as the difference between the minimum and the maximum value, is at the level of 70%. However, in extreme cases it can reach as high as 90%. Moreover, statistical methods, although to varying degrees they enable one to detect outliers, do not allow observations to be divided into groups which are connected with natural and anthropogenic processes. The results obtained by Reimann and Garrett 24 suggest that in the case of small data sets or irregular statistical distribution, methods based on median value should be used. Among the statistical methods Reimann et al. 21 recommends the mMAD and TIF methods with log-transformed data. The method that reliably allowed detection of observations different from the general population was the one based on the CDF on which the inflexion point is detected 74 . The higher inflexion point represents the lower limit of abnormal concentrations 51 . Similarly, Reimann and Caritat 50 recommend the method of identifying breaks in cumulative probability distributions for determining the GBV value, as well as the application of the Tukey inner fence and the 98th percentile. Zhou et al. 12 using the cumulative frequency method and the normalization method, obtained similar results. In this study, the application of the method of normalization (NLRR) to V and La caused that many so-called abnormal values were removed from the TE and REE set. The results obtained by NLRR and CDFIP methods differed even by more than 50% in the case of some elements. The calculated GBV values are used later to determine the degree of contamination of sediments by means of geochemical indices, which are calculated for each element (single indices) or for the whole set of TEs and REEs (integrated indices). GBV values enable one to assess possible anomalies and to provide information on the diffusion and dispersion patterns of pollutants inside the monitored area 53 . In addition, they may be useful for identifying the pollution source 15 . The results of Igeo, EF, PLI and MPI indices indicate that TE and REE concentrations in the bottom sediments of the www.nature.com/scientificreports/ BTNP lakes result mainly from natural geochemical processes. In individual cases, higher concentrations may indicate the occurrence of anthropogenic sources. Similar results were obtained by Raut et al. 25 , which indicated that only in the case of a few elements was the influence of anthropogenic factors, mainly long-range transport of atmospheric pollutants, observed. Strong correlation between REEs with the exception of Yb indicates the same source of origin of this group of elements. Slightly different results were obtained by Wang et al. 8 , who also observed strong inter-group correlations with the exception of Ce, Eu and Gd. The difference in results may be related to different degrees of anthropogenic source interactions. Another important element associated with geochemical analyses of bottom sediments of very large data sets is the division of samples into groups characterized by similar concentrations and identification of sources of elements in bottom sediments. Sahoo et al. 68 stated that the types of statistical processing and data manipulation are critical for understanding the role of various factors in the final geochemistry of the lake sediments. Multidimensional statistical methods are most frequently used for this purpose, among which cluster analysis and principal component analysis dominate 9,27-29 . Aitchison 66 , Filzmoser et al. 67 , Sahoo et al. 68 indicates that data transformation should be carried out before statistical analysis. The clr transformation destroys the closure effect and 'opens' or 'un-constrains' the data to reveal inherent patterns in the data structure that can provide straight forward interpretation. In addition, the selection of representative elements to characterise the contamination of bottom sediments is important to simplify the analysis of large data sets. As this study has shown, it is possible to apply multiple regression models, which allow one to identify relationships between the TEs and REEs and the geochemical indices of PLI and MPI.

Conclusion
The analysis of the study showed that: • the variation of TEs and REEs in the bottom sediments of the lakes located in the BTNP is mainly due to the specific geological structure of this area, which determines the geochemical processes, • anthropogenic sources only for Ostrowite and Krzywce Wielkie lakes had an impact on concentrations of selected TEs and REEs, • the calculation of the geochemical background value by different methods has an uncertainty of approximately 71% on average with variation of 43.0-92.5%, • the most reliable method to determine the geochemical background value and also to detect atypical values is the method based on CDF and the inflexion point, • the most effective elements for characterizing the pollution of bottom sediments of lakes in the BTNP among TEs are: Cr, Co, Cu, Ag, Cd, Zn, Bi, Re, Ba, Al and Rb, • the most effective elements for characterizing the pollution of bottom sediments of lakes in the BTNP among the REEs are: Nd, Gd, Yb, Lu, Eu, Dy and Ce, • the lakes in the BTNP can be divided into four and two subgroups with similar TE and REE contents, respectively.