## Introduction

Limnological research proves that natural processes influencing the process of lake disappearance have been disturbed as a result of human activity and climate change. The transformation of land use and land cover structure contributed to the acceleration of the circulation of matter and energy in these ecosystems, including an increase in the deposition of sediments in lakes. In recent years, the potential impact of climate change, including the occurrence of extreme precipitation, has been highlighted, which can increase erosion and therefore the supply of sediments and pollutants to lakes. With sediment, various elements are deposited in the lakes which come from both natural processes and anthropogenic sources1,2. The comprehensive geochemical surveys can provide important information for understanding modern sedimentary processes and source-catchment interactions in any given geological setting and how they control elemental distribution in lake sediments3. The main factors which controls the concentration and distribution of elements in lake sediments are the catchment lithology, the degree of parent material weathering, the ratio of organic and detrital sediments, the origin of organic and inorganic components, sedimentation, sorting, post-deposition/diagenetic processes, the proportion of atmospheric deposit, moreover lake bottom morphology and depth4,5. The geochemical characteristics of the lake sediments allow determine a number of factors and processes responsible for elements occurrence and distribution4. Inorganic compounds are a kind of archive of anthropogenic activities over the years6. Trace elements (TEs) are indicators that document the influence of anthropogenic activity on lakes in a very good way. Their use as an indicator of long-term impacts is justified due to the very long period of their presence in the environment and their relatively stable concentration. TEs are a specific type of pollution because they are not biodegradable and accumulate in different parts of aquatic ecosystems7. In the natural environment the concentrations of TEs in the bottom sediments are at a level similar to the one recorded in the parental material. On the other hand, the increase in the concentration of TEs above the values recorded in parent materials and change of ratios between individual TEs may indicate their origin from anthropogenic sources. The research carried out most often includes analyses of surface and cores of lake bottom sediments8 and in the rivers that flow into them9. Measurements of TE concentrations together with dating of individual layers of sediment samples allow one to trace the history of pollution10,11. The assessment of the contamination of bottom sediments with TEs is performed most often in relation to the geochemical background value (GBV)12. Gałuszka13 suggested that the GBV enables one to differentiate the natural quantity of given elements in the environment from the quantity originating from anthropogenic sources. The knowledge of the GBV is essential for: defining pollution, identifying the source of contamination and for establishing reliable environmental quality criteria for bottom sediments and surface waters14,15. Determination of GBV has been an important issue16,17. However, development of GBV is not straightforward as there is no single standard or recommended methodology18. Kicińska and Turek19 stated that three group of methods are widely used for establishing the GBV: direct, indirect and integrated methods. Direct methods are based on samples collected from unpolluted areas free from anthropogenic pressure. The GBV is calculated as the arithmetic mean value, the geometric mean value or the median for the data set. In indirect methods the GBV is calculated using statistical methods. The application of indirect methods requires a large number of samples representing a homogeneous environment, characterised by similar climatic and lithological conditions. The statistical methods are based mainly on the assumption that the natural concentration of elements in the environment follows the normal or the log-normal distribution. The GBV is calculated based on the arithmetic mean (Me) and standard deviation (SD). Reimann et al.20 suggest that this approach (assumption of normal distribution) is not always appropriate. For this reason, to calculate the GBV value they recommend using the median value (Md) and the median absolute deviation (MAD). In addition, the box plot method is used for calculations, which enables one to eliminate outliers from the data set. Among the statistical methods, mMAD (median ± 2 Median Absolute Deviation) and TIF (Tukey inner fence) with log-transformed data are most recommended21,22. Moreover, graphical methods are commonly used to calculate GBV. The most commonly used worldwide are the relative cumulative distribution function method and regression analysis method16,23. Sahoo et al.18 suggested that the cumulative frequency distribution plots can be used as an alternative method to define GBV by analysing points of inflection in the curve, but it is more subjective than the mMAD and TIF methods. The integrated method is a combination of direct and indirect methods. A critical review of the methods used to determine GBV was presented in their paper24. The GBV values are used to calculate geochemical indices, which allow one to answer a series of questions: What is the pollution level of bottom sediments? What are their potential eco-toxicological effects? What was or is the human pressure on the lake ecosystem25,26? There are many different approaches to evaluate the pollution status of TEs in sediments, such as: sediment quality guidelines (SQG), geo-accumulation index (Igeo), enrichment factor (EF), pollution load index (PLI), sediment pollution index (SPI) and sediment enrichment factor (SEF)23. Somewhat more problems are related to the analysis of TE sources in bottom sediments. For this purpose, research is carried out in order to identify sources of pollution. There is often an overlap between different sources of pollution27. Therefore, various statistical analyses are carried out to highlight the share of particular sources of pollution in the total transport of TEs to lakes28,29,30,31. Geochemical knowledge on the distribution and migration of elements is crucial to identify their origin14. Nevertheless, it is important to choose the more suitable method to determine GBV for avoiding any unrealistic assessment which may have serious consequences5.

In recent years, the use of new advanced technologies has resulted in a strong interest in rare earth elements (REEs)32. The main sources of REE include medical facilities, petroleum refining, mining and technology industries, fertilizers, livestock feeds, and electronic wastes and recycling plants33. Based on concentrations of REEs in bottom sediments geochemical parameters are calculated. They provide information about the provenance and moreover reflect the fractionation degree in the geochemical processes8. Moreover, the REE concentrations in sediments can be used to identify potential pollution sources34. Gwenzi et al.33 suggested that there is a lack of comprehensive reviews on the anthropogenic sources, environmental behaviour, and public and ecological health risks of REEs.

The relationship between natural and anthropogenic altered concentrations of chemical species involves many factors in the geosciences, environmental and biological sciences, toxicology, and other related disciplines13. The analysis of the literature on anthropogenic pressure on lakes shows that there are relatively few so-called anthropogenic-free lakes around the world. Lakes located in national parks could possibly be considered a small group of these. However, they are also supplied by rivers which are exposed to inflows of pollutants from the catchment area35,36,37,38. These are pollutants that are supplied transgenically and their reduction involves taking action to prevent them over very large areas. In Poland there are 23 national parks with 95 lakes39. Poor water status in lakes located in national parks results from their location within the range of intensive human pressure zone. On the other hand, the energy sector in Poland is largely based on the use of fossil fuels, i.e. black and brown coal. This may be associated with a high supply of pollutants to the atmosphere, which may subsequently be deposited in surface waters in the form of dry or wet deposits.

Lakes located in the BTNP are specific in this context. The BTNP is located in the central part of the Brda River within the area drained by the Struga Siedmiu Jezior River. The catchment area of the Struga Siedmiu Jezior covers 85.5% of the area of the BTNP and can therefore be considered unique. This is due to the total limitation of the inflow of pollutants to the lakes from anthropogenic sources via roads and a very stable land use structure (mainly forest) over the last 200 years40.

The aims of the study were to: (1) analyse TE and REE concentrations in bottom sediments of BTNP lakes, (2) determine GBV values for TEs and REEs, (3) assess bottom sediment pollution by TEs and REEs, (4) analyse the sources of individual elements in bottom sediments, (5) identify elements that may be indicators of natural and anthropogenic processes. Chemical analyses of bottom sediments in the area of the BTNP in such a wide range of lakes and parameters (TEs and REEs) are performed for the first time, and therefore this study may be a reference point for future geochemical surveys.

## Material and methods

### Study site

The Bory Tucholskie National Park (BTNP) was established in 1996. The BTNP is located in the central part of the Brda catchment area within the Struga Siedmiu Jezior catchment. The Struga Siedmiu Jezior stream flows into Charzykowskie Lake, which is located on the Brda River, the left tributary of the Vistula River, the largest river in Poland. The area of the BTNP is 46.13 km2. The most valuable parts of the BTNP are 24 lakes with a total area of 5.37 km2. The Ostrowite, Zielone, Jeleń, Bełczak, Główka, Płęsno, Skrzynka and Mielnica Lakes are located in the Struga Siedmiu Jezior stream (Fig. 1). The lakes Krzywce Wielkie, Błotko, Głuche and Krzywce Małe are connected by an artificial channel which periodically leads the water to the lake Skrzynka. Other lakes do not have direct contact with rivers. The total length of the BTNP watercourses is 12.4 km, which gives in relation to the catchment area a river network density of 0.27 km km−2.

The study included 19 lakes (Table 1). Ostrowite Lake is the deepest lake in the BTNP with a maximum depth of 43 m, whereas Mielnica Lake with a maximum depth of 1.2 m is the shallowest. The lakes’ surface area varies from 0.3 to 280.7 × 104 m2 and volume from 5.5 × 103 to 29,989.8 × 103 m3. The ratio of the total area of the lake drainage basin and the area of the lake is from 6.2 to 544.2 for Ostrowite and Bełczak lakes, respectively. The lake coefficient indicates the potential amount of matter supply from the catchment area (Table 1). The BTNP includes oligotrophic, mesotrophic, eutrophic and dystrophic lakes41,42. According to the hydrological typology, they are flow-through, exorheic and endorheic lakes, and according to the thermal typology, the lakes are stratified, partially stratified and not stratified. The unique feature of the BTNP is the occurrence of five lobelian lakes—Krzywce Wielkie, Gacno Małe, Gacno Wielkie, Nierybno and Głuche. The name “lobelian lakes” comes from the Latin name of the taxon—Lobelia dortmanna L. This species is very rare and hence has been included in the Polish Red Data Book of Plants.

The relief was formed during the period of the Baltic glaciation. The morphology is variable due to development of tunnel valleys, kettle holes and aeolian dunes44. The lowest point is located near Lake Mielnica (119.46 m a.s.l.), and the highest is within the range of hills located in the southern part of the catchment (148.75 m a.s.l.). The highest slopes occur in the hillside area of tunnel valleys and kettle holes. Sandy glacial sediments are the main parent material of the soils. The largest area is occupied by sands (90.2%), followed by loamy and loamy sands (7.0%). In addition, hydrogenic soils (2.8%) represented by mudslides and peats occur within the lake valley area. The lithological structure of the direct lake catchment areas is presented in detail in Table 1. The outwash plain sands and gravels or aeolian sands generally dominate in the direct lake catchment. On the other hand, in Krzywce Wielkie and Olbrachta lakes, peats have a high proportion, and in the catchment areas of Główka, Bełczak and Nierybnozna lakes, there are aggradate muds. One of the unique characteristics of the BTNP is its location, which corresponds 85.5% to the catchment area of the Struga Siedmiu Jezior stream. The lack of transitional rivers makes the lakes isolated from point and non-pollution sources. The atmospheric precipitation and dry deposition may be the only sources of pollution. The BTNP area is dominated by forest areas (86.8%) and water bodies (11.2%). A small part of the catchment is occupied by arable land and built-up areas (total 0.1%) and meadows and pastures (1.9%). The forests area in the BTNP has increased from 75.2 to 86.8% over the last 120 years, mainly due to the reduction of arable land from 9.8% to less than 0.1%. Arable land, meadows and a built-up area are located in the vicinity of the western part of Ostrowite Lake. In addition, there is a road network near the lakes which may be a potential source of pollution. The length of roads in the BTNP area is 116 km, which gives a density of 2.53 km km−2.

### Sample collection and preparation

Samples of bottom sediments for chemical analyses were collected on 7 and 8 August 2017 from 19 lakes located in the BTNP area (Fig. 1). Samples were taken at the deepest point of each lake and in the case of Ostrowite Lake two samples were taken at the deepest points of the two western and eastern parts. Bottom sediment samples of 10 cm in depth were collected by a Nurek-1 sampler (Mera-Błonie, Gdańsk, Poland) into polyethylene (PE) containers. The sample numbers are presented in Table 1. The samples after transport to the laboratory were divided into two parts. The first part of the sediment was dried at a temperature of 21 °C. Subsequently, the total organic matter (TOM) content was determined. The TOM was measured by the Tiurin method with wet oxidation, followed by ferrous ammonium sulphate titration45.

### Chemical analysis

The second part of the sediment samples were dried at 105 °C in a Binder FD 53 (Binder GmbH, Germany). Dry samples were mineralized with aqua regia prepared from HNO3:HCl (1:3 v/v) by Merck (Merck, Darmstadt, Germany). The samples were extracted using the MARS 5 Xpress (CEM, Matthews, NC, USA) microwave digestion system. The concentrations of TEs (Li, Be, Al, Sc, V, Cr, Co, Ni, Cu, Zn, As, Se, Rb, Mo, Ag, Cd, Sb, Ba, Pb, Bi, Th and U) and REEs (La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu) in the bottom sediment samples with a fraction of < 2 mm were determined by inductively coupled plasma mass spectrometry (ICP-QQQ), model 8800 Triple Quad (Agilent Technologies, Japan). This instrument is equipped with a MicroMist nebulizer and a Peltier-cooled (2 °C) Scott-type spray chamber for sample introduction. This instrument contains an octopole-based collision/reaction cell, located in between two quadrupole analyzers. It was operated in a gas mode, with O2 flowing 0.3 mL/min (30%) and He flowing at 5 mL/min. The analytical procedure is described in previous papers2,29,31. The reagents used were analytically pure, and the water was purified to the resistivity of 18.2 MΩ.cm (at 25 °C) in a Direct-Q UV3 Ultrapure Water Systems apparatus (Millipore, France). During the determinations by the ICP-QQQ techniques, standard solutions produced by VHG Labs, Inc. (Manchester, England) were used. We measured the Certified Reference Material CRM no. LGC 6187 used for River sediments (Manchester, England), and high compliance with reference values was found (see Supplementary Table S1 online). MassHunter software for ICP-QQQ (Agilent Technologies, Japan) was used to control the instrument and to process the data29.

### TE and REE concentrations in bottom sediments

In the first stage the general characteristics of TE and REE concentrations were determined. For this purpose, the values of minimum, average, maximum concentration and standard deviation were calculated. Additionally, the percentile values were calculated for all elements (1%, 5%, 10%, 25%, 50%, 75%, 90%, 95% and 99%). Then TEs and REEs were analyzed for their variability, type of distribution and occurrence of outlier observations. The Grubbs-Beck test (G-B) was used to detect outlier observations. The distribution of TEs and REEs data was investigated using one-sample Kolmogorov–Smirnov test (or the K–S test of normality). The G–B and K–S tests were performed at the significance level of 0.05.

### Determination of GBV—comparison of different approaches

GBV calculations were made using 13 different statistical methods (Table 2).

The first method used is one of the most frequently described in the literature, which uses the mean value (Me) and standard deviation (SD)—M2SD (Me + 2SD). With respect to this method there are also many critical comments, which result from the assumption of a priori normality of the distribution of concentrations of analysed elements. In this study it was found that some of the TEs were normally (Al, V, Cr, Co, As, Ba, Re, Pb, Bi and Th) or log-normally distributed (Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U). However, the REEs were log-normally distributed. For this reason, an attempt was made to modify the previously used method. For this reason the data were therefore log-transformed (log10(x)). On the basis of log-transformed data, the values of the Me + 2SD were calculated. Then the results were back-transformed—LTM2SD. Taking into account the other critical comments concerning the calculation of GBV by the M2SD method, it was decided to remove outliers identified by the Grubbs–Beck test. Then, on the basis of data excluding outliers, GBV calculations were made using a standard procedure based on Me and SD (ORM2SD) and in combination with logarithmic transformation (ORLTM2SD). Another way that is recommended when calculating GBV because of the occurrence of distributions other than normal and outlier observations is to use formulas based on the median value (Md). Therefore, the M2MAD method based on the median value (Md) and median absolute deviation (MAD) was used to determine GBV. Reimann et al.21 and Salomão et al.22 recommended applying LTM2MAD method based on the median value (Md) and median absolute deviation (MAD) with log-transformed dataset. In addition, the Tukey Boxplot method (Tukey inner fence, TIF) was used, which is based on the following values: 75th percentile (Q3) and inter quartile range (IQR). The IQR is calculated as the difference between the 75th percentile and 25th percentile (Q3Q1) multiplied by the value 1.547. Reimann et al.21 recommended also to use LNTIF method, which is analogous to the TIF method, however, a log-transformed dataset are used. Moreover, for each TE and REE a cumulative distribution function (CDF) was developed, from which the values of the 90th, 95th and 98th percentile were obtained. These values were considered to be GBVs. The CDF was also used to find inflexion points. The lower inflexion point represents the upper limit of natural origin concentrations, and the higher inflexion point represents the lower limit of abnormal concentrations51. The inflexion points in a CDF were determined by examining the CDF vs. TEs and/log-TEs and log-REEs data under the linear regression model for P < 0.05 and R2 > 0.95. The data of Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U and REEs were therefore log-transformed prior to the CDF plotting. The TEs and REEs concentrations were plotted on the x-axis and the cumulative frequency was plotted on the y-axis. The maximum value was removed to obtain linearity of the data; this procedure was repeated until the above two criteria were met, and the maximum value within the regression line will represent the inflexion51. Finally the first point deviating from the regression line was chosen as the upper inflexion point, which was considered to be the baseline value for the selected TEs and REEs. In the last NLRR method was carried out to find the linear relationship between the contents of each TE and REE elements and reference elements53,54. Reference elements proposed by Wei and Wen51 included Al, Fe, La, Li, Sc, Ti and V. In order to select the best reference elements, correlation analysis was carried out, and showed that for TEs the reference element is V and for REEs the reference element is La. The choice of these elements resulted from the degree of correlation between the individual TEs or REEs and the reference elements. The linear relationships between each TE and V and REE and La were determined, in the form of y = ax + b55. As proposed by Covelli and Fontolan53 the data beyond 95% confidential intervals of the linear regression were removed as outliers. Thus, the developed linear regressions were used to calculate the GBV of TEs and REEs based on V and La concentrations respectively.

The calculated GBV values were compared with those provided by Salminen et al.56 in the European Geochemical Atlas. GBV values were obtained for the location corresponding to the BTNP.

### Assessment of bottom sediment contamination

The assessment of bottom sediment pollution by TEs and REEs was performed on the basis of four commonly used indices: the index of geo-accumulation (Igeo), the enrichment factor (EF), the pollution load index (PLI), and the metal pollution index (MPI). Igeo and EF are single indices, while PLI and MPI are integrated indices. For Igeo, EF and PLI the concentrations of TEs and REEs in bottom sediments are related to the GBV. Formulas for calculating Igeo, EF and PLI are widely used and can be found in the publications57,58,59. The limit values for the assessment of bottom sediment contamination have been assumed to be Martin and Meybeck58, Harikumar et al.60 and El-Sayed et al.61.

The only index that does not use GBV is MPI. The MPI was used to compare the total TE and REE contents between different lakes. The MPI was calculated as follows:

$${\text{MPI}} = \left( {{\text{C}}_{{{\text{n}}1}} \cdot {\text{C}}_{{{\text{n}}2}} \cdot \cdots \cdot {\text{C}}_{{{\text{ni}}}} } \right)^{{\frac{1}{{\text{i}}}}} ,$$

where Cn are concentrations of TEs and REEs in the sample and i is the number of analyzed metals62,63. The GBV for TEs and REEs was assumed on the basis of our own research results. CDFIP methodology based on CDF and inflexion point was assumed as a reference. The calculated GBVs for lake bottom sediments in BTNP are presented in Table 4. In order to present the results of the Igeo, EF, PLI and MPI indices, a Python language program (GeoVIS) was developed, which allows the automatic creation of heat maps.

### Calculation of REE anomalies

The analysis of REE concentrations in bottom sediments started with the development of standardized graphs with respect to concentrations in chondrite64. The Eu, Gd and Yb anomalies were preliminarily identified on standardised graphs against REE concentrations in chondrite. Then the analysis of the REE concentration anomalies was performed. Anomalies in the REE concentration enable one to identify potential sources of pollution and the influence of natural processes on their concentration32. In order to calculate the Eu, Gd and Yb anomaly, the equations proposed by Benabdelkader et al.65 were used. In the last stage the REE values were calculated for three groups: light (LREE—sum of La, Ce, Pr and Nd concentrations), medium (MREE—sum of Sm, Eu and Gd concentrations) and heavy (HREE—sum of Tb, Dy, Ho, Er, Tm, Yb and Lu concentrations).

### Statistical analysis

The collected data were subjected to multivariate statistical analyses. The multivariate analysis was carried out on data that were transformed using centred log-ratio (clr). The clr transformation destroys the closure effect and ‘opens’ or ‘un-constrains’ the data to reveal inherent patterns in the data structure that can provide straight forward interpretation66,67,68.

Cluster analysis (CA) was conducted to divide the lakes to the groups characterized by the highest similarity in terms of TE and REE concentrations in bottom sediments. CA analysis was performed using the Ward method and the Euclidean distance square as a measure of similarity. A dendrogram was used to visualize the results of the CA analysis. The results of clustering lakes on the basis of TE and REE concentrations in lake bottom sediments was presented using the standardized linkage distance (Dlink/Dmax). The mean and median values were calculated for the groups and subgroups. To assess the statistical differences among groups of lakes taking into account concentrations of TEs and REEs in bottom sediments a non-parametric test of the variance (Kruskal–Wallis, K–W) and Dunn’s tests as post hoc procedures (P ≤ 0.05) were performed.

In order to select the next method of statistical analysis, detrended component analysis (DCA) was carried out. DCA was conducted to assess the environmental factors’ distribution pattern. The DCA indicated that the TE and REE concentrations and environmental factors were in linear distribution. Ter Braak and Smilauer69 suggested that in such cases principal component analysis (PCA) is more appropriate for future analysis. The number of significant principal components (PC) was selected on the basis of a Kaiser criterion—eigenvalues higher than 1. The correlation between PC and TEs and REES was classified according70. The correlation coefficient values > 0.75, 0.75–0.50 and 0.50–0.30 indicate strong, moderate and weak relationships. In order to identify the elements that best allow for the assessment of the contamination of bottom sediments, the multiple regression method was applied. Input data were all TEs and REEs (independent variables) and values of aggregated indexes PLI and MPI (dependent variables). The definition of the multiple regression model was carried out using the progressive stepwise method. The elements which were included in the model and were significant at the level of 0.05 were selected as relevant for the characteristics of bottom sediment contamination.

The Grubbs–Beck test, Kruskal–Wallis (K–W) and Dunn’s tests, Kolmogorov–Smirnov test and cluster analysis (CA) were carried out using STATISTICA software version 13.1 (Dell, Inc. 2016, Dell Statistica). Canoco software version 5.0 (Microcomputer Power, Ithaca) was used for the PCA and DCA. The clr transformation was carried out using open source statistical software R version 3.6.1 (R Development Core Team, 2009).

## Results

In the first stage of the analysis, the values of minimum, mean, median, maximum and standard deviations were calculated. In addition, CDF were developed from which the values of percentiles 1, 5, 10, 25, 50, 75, 90, 95 and 99 were obtained (Table 3). Based on the mean values, the TE concentrations were ordered in the following ascending order: Tl < Se < Th < Ag < Sc < Bi < Be < Re < Sb < U < Co < Cd < Rb < Mo < Cr < As < Cu < Ni < Li < V < Pb < Ba < Z < Al. Among the analysed TEs the highest variability was observed in the concentrations of Se, Ag, Cd, Sb and U. The lowest variability was observed for Al, Ba and Th. On the other hand, the concentrations of REEs in the bottom sediments of lakes were in the following ascending order Lu < Tm < Ho < Tb < Yb < Yb < Eu < Er < Dy < Sm < Pr < Gd < La < Nd < Ce. In the case of REEs the variability was lower than in TEs. The highest variability was observed for Er, Tm and Lu and the lowest for La, Pr, Nd, Ho and Yb.

The Grubbs–Beck test showed that there were 21 outlier observations in the TE dataset (Table 3). Outlier values were found in the Li, Be, Sc, Ni (two values), Cu (two values), Zn, As, Se (two values), Mo (two values), Ag (three values), Cd, Sb, Tl and U (two values). Ten outlier values were recorded in samples of bottom sediments from the western part of Ostrowite Lake (sampling point no. 1—Li, Be, Sc, Ni, Se, Ag, Cd, Sb, Tl and U), four outlier values in the eastern part of Ostrowite Lake (sampling point no. 2—Ni, Se, Ag and U), two outlier values in Gacno Wielkie Lake (Cu and Ag) and one outlier value in Krzywce Wielkie Lake (Cu), Zielone Lake (Mo), Gacno Małe Lake (Mo), Krzywce Małe Lake (As) and Głuche Lake (Zn) (Table 3). REE analysis revealed 19 outliers in the data set. Among 14 REEs in 13 cases outliers were observed. Only in the case of Yb were no outliers observed. Most outlier observations were recorded in the sediment samples from the western part of Ostrowite Lake (no. 1—La, Ce, Pr, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Lu). Moreover, six outlier values (Eu, Gd, Tb, Ho, Er and Lu) were found in the eastern part of Ostrowite Lake (no. 2). Among the analysed TEs the following elements had normal distributions: Al, V, Cr, Co, As, Ba, Re, Pb, Bi and Th. The other TEs elements—Li, Be, Sc, Ni, Cu, Zn, Se, Rb, Mo, Ag, Cd, Sb and U—were log-normally distributed. The distributions were right-skewed; mean values (Me) were higher than median values (Md). All REEs were log-normally distributed.

Statistical analysis showed that in the TE and REE data set there are elements that are characterized by different variability and distributions. Moreover, there are also outliers among them. Taking this into account, it was decided to test different methods of GBV calculation in order to illustrate the uncertainty of such analyses. Moreover, it should be noted that the analysed lakes are located in a small area, which is separated from anthropogenic sources, including transgenic ones. The GBV results obtained by different methods are shown in Table 4. The number of values exceeding the set GBV values is given in brackets.

The descriptions of methods are presented in Table 2. The EGA values were obtained from the European Geochemical Atlas developed by Salminen et al.56 for the BTNP location. In order to compare the results obtained with different methods, they were standardized in relation to the maximum value (Fig. 2).

The results showed that the differences between the minimum and maximum GBV values were for U 92.5% (the highest value) and for Th 43.0% (the lowest value). Average differences between the minimum and maximum GBV values within TEs and REEs were about 71%. The highest GBV values were obtained using LTTIF method, while the lowest values were obtained using CDFIP method. Using the LTTIF method, very high GBV values were obtained. For 31 elements, their concentrations in bottom sediment were lower than GBV. For Be, Ni, Gd, Tb and Er only 1 value exceeded GBV value and for Se and Ag only 2 values. Also high GBV values were obtained using LTM2SD method. For 27 elements the maximum concentrations did not exceed GBV value and for 11 elements only one value was higher than GBV. Using LTM2SD methods, GBV values were on average 30% higher than those of the 98th percentile. The lowest GBV values were obtained using the CDFIP method. In this case, GBV values were on average about 10% lower than the value corresponding to the 90th percentile. However, for Sc, Cr, Co, Cu, Zn, Ag and Cd the GBV values obtained by means of the CDFIP method were over 30% lower than those corresponding to the 90th percentile. In case of small research objects such as BTNP it is very important to thoroughly analyse the whole set of measurement data. From the methodological point of view the CDFIP method is the most suitable one, in which the inflexion point is detected. This point enables indication within a series of data of those potentially related to anthropogenic sources. The application of a method based on CDF enables detection of such observations, which differs from the others. The analysis showed that among TEs such elements were detected in samples taken from the western part of Ostrowite Lake (16 times), Krzywce Wielkie (ten times), the eastern part of Ostrowite Lake (seven times), Głuche (six times), Krzywce Małe (four times), Bełczak (three times), Płęsno, Gacno Wielkie and Główka (two times) and Jeleń, Zielone, skrzynka, Gacno Małe, Mielnica i Nierybno (one times). There were no outliers when analyzing CDF for TEs in the case of Olbrachta, Kocioł, Kociołek, Kacze Oko and Rybie Oko lakes. In the case of REEs outlier values were found only in the western and eastern part of Ostrowite Lake 13 and 10 times, respectively. Taking into account individual elements, the greatest number of outliers was found in Sc, Co and Zn (five times), Li and Cr (four times) and V, Cu, Ag and Cd (three times). However, in the case of Al and Rb, outliers were not detected. Among REEs there were usually two outliers (ten times). Only in the case of Yb no outliers were observed. The standard approach to analysis using M2SD, M2MAD, LTM2MAD, TIF and LTTIF methods does not allow one to detect values that could potentially be associated with anthropogenic sources. Detailed statistical analysis allows one, on the one hand, to remove outliers (ORM2SD), transform variables to a normal distribution (LTM2SD) or perform the above operations together (ORLTM2SD), but still these methods do not allow one to detect suspicious observations. In the case of the NLRR method, which was to normalize TEs against V and REEs against La, the observations outside the 95% confidence interval were removed. These observations are considered to be atypical. When NLRR was used, significantly more atypical values were identified among TEs. For western and eastern parts of Ostrowite Lake, Jeleń, Płęsno, Krzywce Małe, Nierybno, Główka, Bełczak, Głuche and Olbrachta from 10 to 18 atypical parameters were identified out of 24 examined. There were no lakes without atypical values among TEs. In the case of REEs, 5–10 atypical values were detected in lakes Ostrowite, Krzywce Wielkie, Gacno Małe, Krzywce Małe, Bełczak, Głuche and Olbrachta. Only in the case of Lake Gacno Wielkie were no atypical values observed among REEs. Taking into account individual elements, the most atypical values were found among Al (12), Pb, Th and Gd (11) and Sc, Re, Bi and U (10). The least atypical values were identified within Pr (3) and Cr and Nd (4). The results show that it is a method that does not always exclude outliers from the data set. These are rather atypical values resulting from the different geological structure of geochemical processes rather than from the influence of anthropogenic sources. In addition, calculations have shown that there is no relationship between V and Mo and Ba, and La and Yb. In this respect, another element for standardisation should be identified, which may have an impact on the homogeneity of GBV results. The results show that the calculation of the GBV value requires an individual approach for each element. Taking into account the above results, it was considered that the choice of the CDFIP method is the most appropriate one. On the one hand, lakes where bottom sediments may have been contaminated have been identified. On the other hand, data sets within individual elements have been reduced by up to five values associated with anthropogenic sources. GBV results calculated with the CDFIP method were used to determine the contamination of sediments by means of geochemical indices Igeo, EF and PLI.

Results obtained with the Igeo index indicate that in most lakes the bottom sediments were unpolluted (Fig. 3a). In lakes Jeleń (no. 3), Olbrachta (no. 16), Kocioł (no. 17), Kociołek (no. 18) and Kacze Oko (no. 19) and Rybie Oko (no. 20) Igeo values for all elements were lower than 0, which indicates that they were not contaminated. For most of the other lakes, single values of the Igeo index were higher (from 0 to 1), with the exceptions of Ostrowite (no. 1 and 2) and Krzywce Wielkie Lake (no. 5). In the eastern part of Ostrowite Lake (2) and Lake Krzywce Wielkie (2) the values of the Igeo index from 0 to 1 were 13 times and nine times, respectively. The most polluted were the bottom sediments of the western part of Ostrowite Lake (no. 1), 19 values of the Igeo index indicated their poor pollution (0 < Igeo ≤ 1), seven values moderate pollution (1 < Igeo ≤ 2) and in one case more than moderate pollution (Igeo > 2). Taking into account particular elements, moderate pollution was found for Ag, Cd, U, Tb, Er and Lu. However, in the case of Se, pollution exceeded the level of moderate pollution one time. The results of the CF index in most cases corresponded to the results of the Igeo index (Fig. 3b). However, the use of the CF index indicates slightly higher contamination of bottom sediments. The results showed that the most polluted were the sediments of the western part of Ostrowite Lake (1), and among 38 analysed elements as many as in 22 cases CF values indicated moderate pollution (1 < CF ≤ 3). Moreover, in seven cases the pollution was considerable (3 < CF ≤ 6) and in one case there was very high contamination (CF > 6). In the case of other lakes the CF index values were lower than considerably polluted and very highly polluted. For most lakes there were single elements, which indicated moderate contamination. A higher number of elements indicated moderate contamination in the case of the eastern part of lakes Ostrowite (no. 2)—22 elements, Krzywce Wielkie (no. 5)—23 elements, Głuche (no. 15)—nine elements and Krzywce Małe (no. 10) and Nierybno (no. 12)—seven elements. Taking into account individual elements, the values of CF index above 1 were most often noted in the cases of Li and Cr (five times) and Sc, Co and Zn (six times). In the case of Ag, Cd, Tl, U, Tb, Er and Lu, one value each was found to show considerable pollution and in the case of Se one value was found to show very highly pollution. PLI index analysis showed that for TEs values ranged from 0.17 to 1.45 and for REEs from 0.09 to 1.96. PLI values above 1, which indicates sediment contamination for TEs, were found for the western part of Ostrowite Lake (no. 1) and Krzywce Wielkie Lake (no. 5), whereas for REEs values for both parts of Ostrowite Lake (nos. 1 and 2) were found (Fig. 3c). The MPI values ranged from 0.23 to 1.95 and from 0 to 0.413 for TEs and REEs respectively. In the case of TEs, the most polluted were bottom sediments of the western part of Ostrowite Lake (no. 1) and Krzywce Wielkie Lake (no. 5). Moreover, MPI values higher than 1 were recorded for the eastern part of Ostrowite Lake (2), Krzywce Małe Lake (10) and Głuche Lake (no. 15) (Fig. 3d). In the case of REEs, both parts of Ostrowite Lake (nos. 1 and 2) were the most polluted.

The analysis of REEs concentration in bottom sediments of the BTNP lakes was started with the development of standardized graphs with respect to their concentrations in chondrite (Fig. 4). In terms of REEs the lakes were divided into three groups: (1) Ostrowite Lake—western and eastern part (Fig. 4a), (2) Kacze Oko, Rybie Oko, Kocioł and Kociołek lakes (Fig. 4b) and (3) other lakes (Fig. 4c,d) excluding Mielnica Lake. In the case of this reservoir, the standardized REE graph differed from all other lakes. Visual analysis of the graphs showed the occurrence of a negative Eu anomaly and a positive Gd anomaly in most lakes. Moreover, in Ostrowite Lake and additionally in Kocioł, Kacze Oko and Rybie Oko a negative anomaly of Yb and Tm was observed respectively. The formulas presented by Benabdelkader et al.65 were used to confirm the above observations. It was assumed that the computations of Eu, Gd, Yb and Tm anomalies will have values above > 1.1 or below < 0.9, which indicates the occurrence of positive and negative anomalies, respectively. A value of 1 means that the element was not fractionated relative to the crustal composition62. The calculations confirmed the occurrence of negative Eu anomalies, except for Mielnica Lake. The values of Eu anomalies ranged from 0.30 to 0.84. A similar negative Eu anomaly was found in the bottom sediments of Lake Wadąg32. The lowest values were recorded for Kocioł and Rybie Oko lakes 0.30 and 0.42, respectively. A positive Gd anomaly was found for all lakes in the range from 1.28 to 1.71. Yb analysis showed a very strong negative anomaly for Ostrowite Lake (0.05 and 0.10) and a very strong positive anomaly for Kociołek, Kacze Oko and Rybie Oko lakes (from 2.34 to 2.76). In the case of Kociołek and Rybie Oko lakes there were negative Tm anomalies of 0.14 and 0.26, respectively.

REE analysis in particular groups showed that LREE dominated over MREE and HREE in bottom sediments (Fig. 5). This is confirmed by earlier studies by Sojka et al.32, obtained for Lake Wadąg located in north-eastern Poland. Benabdelkader et al.65 demonstrated that reservoirs play an important role in REE retention (LREE) and to delay the release of MREE and HREE downstream. There was slightly more MREE in the individual lakes than HREE. Only in the case of Kociołek, Kacze Oko and Rybie Oko lakes were the MREE and HREE contents at a similar level. The (La/Yb)N ratio normalized in relation to chondrite in the bottom sediments of the BTNP lakes ranged from 7.7 to 160.4. The lowest (La/Yb)N ratio was observed in Lake Olbrachta (7.7) and the highest in the western and eastern parts of Ostrowite Lake (160.4 and 103.7, respectively). In the other lakes the values (La/Yb)N ranged from 9.6 to 23.4. The obtained values were generally lower than that obtained by Sultan and Shazili71 which was about 20.6.

The CA analysis for TEs enabled the division of lakes into two main groups: TE-1 and TE-2 (Fig. 6a). Within each of the groups, two subgroups have been distinguished. The TE-1–1 subgroup includes western and eastern part of Ostrowite Lake with the highest concentrations of Li, Be, Sc, V, Cr, Co, Ni, Se, Ag, Cd, Sb, Re, Tl, Bi, Th and U. The lowest TE concentrations were found in the TE-2–2 subgroup, which included four lakes: Gacno Małe, Kocioł, Kociołek and Rybie Oko. The TE-1–2 subgroup includes seven lakes, connected by the Struga Siedmiu Jezior. The concentrations in the TE-1–2 subgroup were slightly lower than in the TE-2–1 subgroup, except for Li, Mo, U and Sc. It should be noted that the highest differences in TE concentration were found in the TE-2 group, which included 11 lakes. However, this is due to the lack of direct connection between the lakes and the various geological structure. In the case of REE concentrations in bottom sediments, lakes were divided into two groups: REE-1 and REE-2 (Fig. 6b). REE-1 includes both parts of Ostrowite Lake, in which REEs concentrations were at the highest level. However, within the REE-2 group REEs concentrations were lower, but there was a relatively large variation between individual lakes. In general, the lowest REEs concentrations were found in Mielnica, Kociołek, Kociołek, Kacze Oko and Rybie Oko lakes.

PCA analysis in the case of TEs allowed us to distinguish six significant components, whose eigenvalues were higher than 1. The components explain 83.10% of the internal data structure (Table 5). With the first PC1 component, Zn, Ba and Th concentrations were strongly negatively correlated and U concentrations were strongly positive correlated. Moreover, with PC1 the concentrations of Al, As and Rb and the concentration of Li, Cd, Re and Tl were respectively moderate negatively and positively correlated. A correlation with PC2 was noted with eight elements. For Pb and Bi strong negative correlation, Cd and Sb moderate negative correlation and for Be, Ni and Se moderate positive correlation. Al, Sc, Cr, Co, Rb and Ag (PC3), Mo and Sb (PC4), V and As (PC5) and Cu (PC6) were correlated with subsequent components respectively. The projection of the elements against the background of the four components is shown in Fig. 7a,b. The concentrations of Li, Ni, Sc, Se, Be, Se, Ag, Re, Tl, Cd, Sb and U may be related to the impact of anthropogenic point sources found mainly in the Ostrowite Lake. However, Cu and Mo concentrations are associated with point sources of pollution occurring in other lakes. The other TEs comes mainly from geogenic sources. This is confirmed by the distribution of concentrations of these elements (generally normal) and the absence of outliers.

The PCA analysis of REEs allowed us to distinguish four main components which explain 83.08% of the data structure (Table 5). There were nine elements correlated with PC1 and six elements correlated with PC2. Whereas Eu, Ho and Sm were correlated with PC3 and PC4. The projection of the elements against the background of the four components is shown in Fig. 7c,d. Analysis of REEs against the background of PC1 and PC2 components allows for their division into LREEs and MREEs and HREEs. The exception is Yb and Eu and Ho. For REEs, their distribution in bottom sediments is mainly related to natural geochemical processes, which is confirmed by the concentrations and strong inter-group correlations. The only exception is Lake Ostrowite, where there were outlier values, which indicate their anthropogenic origin. The concentrations of Yb in bottom sediments show considerably different spatial variation from other REEs. There were no outliers within the Yb concentrations. However, both negative and positive anomalies of Yb concentrations were recorded in lakes. In the case of Eu, almost all lakes negative anomalies were observed.

In order to expose the elements best describing the state of contamination of bottom sediments, multiple regression analysis was carried out. Relationships between individual TEs or REEs (independent variables) and values of PLI and MPI indices were investigated. For TEs the calculations showed that for both PLI and MPI the best model with 12 elements—Cr, Co, Cu, Ag, Cd, Zn, Bi, Re, Ba, Al and Rb—was used. The models were statistically significant at 0.05 and the determination coefficient was 0.99. In the case of REEs the results showed that the model for PLI was based on seven elements (Nd, Gd, Yb, Lu, Eu, Dy and Ce) and in the case of MPI only one element (Dy). The determination coefficients for the models for PLI and MPI were 0.76 and 0.99, respectively.

## Discussion

The studies indicate that among TEs and REE there were elements that were characterized by various variability, types of distributions and occurrence of outliers. The obtained results are somewhat unexpected as the studied lakes are located in a small area and were formed as a result of glacial processes of the last Central European glaciation. The surveyed area is located completely in the Struga Siedmiu Jezior catchment, which eliminates the inflow of pollutants from the external area. Moreover, the BTNP area is characterised by relatively stable forest land use in the last 120 years. Bottom sediment samples have also been taken from the deepest locations within each lake, away from the shore line. Most often higher concentrations are recorded in the littoral zone than in the profundal zone32,72. The studies presented in this paper should be considered as very valuable and may serve as a reference point for further geochemical studies in this area. Similar conclusions were drawn from the studies (Raut et al. 2017)25 for shallow to medium lakes in high-altitude mountains, which may be important archives for studying various types of pollutants. An essential stage of geochemical analysis is the determination of the GBV value, which is used to assess the contamination of bottom sediments. Gałuszka and Migaszewski14 indicated that only reliable GBVs enable one to establish objective environmental quality criteria. First of all, it is important to use GBV values calculated for a given region to obtain more reliable results19,50,73. Secondly, the choice of the method of estimating the GBV value itself is important. In this paper, 13 methods of GBV calculation were tested. The analysis showed that the uncertainty related to the choice of method for GBV determination, expressed as the difference between the minimum and the maximum value, is at the level of 70%. However, in extreme cases it can reach as high as 90%. Moreover, statistical methods, although to varying degrees they enable one to detect outliers, do not allow observations to be divided into groups which are connected with natural and anthropogenic processes. The results obtained by Reimann and Garrett24 suggest that in the case of small data sets or irregular statistical distribution, methods based on median value should be used. Among the statistical methods Reimann et al.21 recommends the mMAD and TIF methods with log-transformed data. The method that reliably allowed detection of observations different from the general population was the one based on the CDF on which the inflexion point is detected74. The higher inflexion point represents the lower limit of abnormal concentrations51. Similarly, Reimann and Caritat50 recommend the method of identifying breaks in cumulative probability distributions for determining the GBV value, as well as the application of the Tukey inner fence and the 98th percentile. Zhou et al.12 using the cumulative frequency method and the normalization method, obtained similar results. In this study, the application of the method of normalization (NLRR) to V and La caused that many so-called abnormal values were removed from the TE and REE set. The results obtained by NLRR and CDFIP methods differed even by more than 50% in the case of some elements. The calculated GBV values are used later to determine the degree of contamination of sediments by means of geochemical indices, which are calculated for each element (single indices) or for the whole set of TEs and REEs (integrated indices). GBV values enable one to assess possible anomalies and to provide information on the diffusion and dispersion patterns of pollutants inside the monitored area53. In addition, they may be useful for identifying the pollution source15. The results of Igeo, EF, PLI and MPI indices indicate that TE and REE concentrations in the bottom sediments of the BTNP lakes result mainly from natural geochemical processes. In individual cases, higher concentrations may indicate the occurrence of anthropogenic sources. Similar results were obtained by Raut et al.25, which indicated that only in the case of a few elements was the influence of anthropogenic factors, mainly long-range transport of atmospheric pollutants, observed. Strong correlation between REEs with the exception of Yb indicates the same source of origin of this group of elements. Slightly different results were obtained by Wang et al.8, who also observed strong inter-group correlations with the exception of Ce, Eu and Gd. The difference in results may be related to different degrees of anthropogenic source interactions.

Another important element associated with geochemical analyses of bottom sediments of very large data sets is the division of samples into groups characterized by similar concentrations and identification of sources of elements in bottom sediments. Sahoo et al.68 stated that the types of statistical processing and data manipulation are critical for understanding the role of various factors in the final geochemistry of the lake sediments. Multidimensional statistical methods are most frequently used for this purpose, among which cluster analysis and principal component analysis dominate9,27,28,29. Aitchison66, Filzmoser et al.67, Sahoo et al.68 indicates that data transformation should be carried out before statistical analysis. The clr transformation destroys the closure effect and ‘opens’ or ‘un-constrains’ the data to reveal inherent patterns in the data structure that can provide straight forward interpretation. In addition, the selection of representative elements to characterise the contamination of bottom sediments is important to simplify the analysis of large data sets. As this study has shown, it is possible to apply multiple regression models, which allow one to identify relationships between the TEs and REEs and the geochemical indices of PLI and MPI.

## Conclusion

The analysis of the study showed that:

• the variation of TEs and REEs in the bottom sediments of the lakes located in the BTNP is mainly due to the specific geological structure of this area, which determines the geochemical processes,

• anthropogenic sources only for Ostrowite and Krzywce Wielkie lakes had an impact on concentrations of selected TEs and REEs,

• the calculation of the geochemical background value by different methods has an uncertainty of approximately 71% on average with variation of 43.0–92.5%,

• the most reliable method to determine the geochemical background value and also to detect atypical values is the method based on CDF and the inflexion point,

• the most effective elements for characterizing the pollution of bottom sediments of lakes in the BTNP among TEs are: Cr, Co, Cu, Ag, Cd, Zn, Bi, Re, Ba, Al and Rb,

• the most effective elements for characterizing the pollution of bottom sediments of lakes in the BTNP among the REEs are: Nd, Gd, Yb, Lu, Eu, Dy and Ce,

• the lakes in the BTNP can be divided into four and two subgroups with similar TE and REE contents, respectively.