Introduction

Transition metals are the elements in groups IIIB to IIB of the Periodic Table of the elements. The last electron in a transition metal normally fills the d orbital of the penultimate shell, resulting in low ionization energies and multiple valences. Transition metals, with richer chemical characteristics than main-group elements1, are an important class in the Periodic Table. They are widespread in aquatic environments, mostly at low concentrations, but can exert detrimental effects on aquatic life and human health. Water quality criteria (WQC) are the scientific foundation for assessment of the quality of aquatic environments and for risk management. The United States Environmental Protection Agency (USEPA) published the first WQC guidelines, referred to as the “Red Book”, in 1976. The document proposed criteria maximum concentrations (CMCs) for nine transition metals2. The USEPA has subsequently updated WQC guidelines seven times in the past 40 years3,4,5,6,7,8,9. In the latest guideline, the USEPA recommended CMCs for only 10 metals for protecting aquatic life; seven of them are transition metals8, i.e., chromium (Cr (III), Cr (VI)), nickel (Ni), copper (Cu), zinc (Zn), silver (Ag), cadmium (Cd) and mercury (Hg).

WQC for more than 50 other transition metals have not yet been promulgated by regulatory jurisdictions because of a general lack of empirical information on toxic potencies of these elements to model aquatic species, which is needed to derive WQC10. Because tests to determine toxic potencies are often costly and time-consuming, such data are not available for many species and are particularly rare for endangered species that would be the focus of protective WQC. Furthermore, toxicities of some non-essential transition metals are greater than those of the main-group elements.

Two indirect methods have been used to predict toxic potency of metals for which toxicity data are insufficient. The first method is the interspecies correlation estimation (ICE) model, which is intended for species that cannot be tested and extrapolates from toxicity data for surrogate species11. The second method is based on quantitative structure-activity relationships (QSARs), which are correlations between physicochemical properties and toxic potencies of target compounds12,13. These methods are not adequate for data-poor, non-essential transition metals for which data are available for only surrogate or common species. Therefore, new methods to directly predict WQC of transition metals using minimal toxicity data are desirable.

Using fewer species and making better predictive models are the future integrated strategies of toxicology14. Critical mechanisms of toxicities of metals are often associated with their electronic structures and key physicochemical properties, such as binding affinity with biological macromolecular ligands15. Hence it has been proposed that physicochemical parameters can be used to develop models to predict toxic potencies of metals16. Because they are similar in electronic structures, transition metals can have similar physicochemical properties and mechanisms of toxicity17. For example, more than 20 physicochemical parameters, including softness, hydrolyzability, ionizability, complexing ability and geometric characteristics, have been shown to correlate with biological activities16. In contrast, methods recommended by the USEPA, such as toxicity centile rank, SSDs and evaluation factors, all require data on toxic potency to several species to derive both WQC and CMCs18.

To demonstrate this structural property-based approach, empirical relationships between the USEPA-recommended CMCs and physicochemical properties of seven transition metals were established. After the most relevant parameters were selected, a model was established to predict CMCs of 49 other transition metals in the fourth, fifth, sixth and seventh periods of the Periodic Table of the elements, including the Lanthanide and Actinide Series. The predicted values were then compared with toxicity data from the literature, so as to examine the utility and reliability of the predictive model.

Results and Discussion

Single Physicochemical Properties-CMCs Relationships of Transition Metals

Twenty-six descriptors of physicochemical properties were considered in constructing models to predict CMCs by use of single-parameter linear regressions (Table 1). Seven structural parameters, including atomic number (AN), relative atomic weight (AW), covalent radius (CR), Pauling ionic radius (r), atomic ionization potential (AN/∆IP), softness index (σp) and electron density (AR/AW), were found to reasonably correlate with the CMCs of the seven transition metals recommended by the USEPA (R2 > 0.5 and P < 0.05; Table 2). It is therefore possible to develop empirical models by use of physicochemical properties and recommended CMCs for the seven transition metals, which can be employed to predict CMCs of other transition metals.
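The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pearson_r` and `fit_line` are hypothetical, the numbers are placeholders rather than the values in Table 1, and the P-value test (which requires a t distribution) is omitted.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

# Placeholder numbers only, NOT the values from Table 1.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ln_cmc = [6.1, 5.2, 5.0, 4.1, 3.3, 3.0, 1.9, 1.2]

r = pearson_r(descriptor, ln_cmc)
intercept, slope = fit_line(descriptor, ln_cmc)
r_squared = r ** 2
keep = r_squared > 0.5  # R2 threshold used in the text; the P < 0.05 check is not shown
```

A descriptor would be retained for modeling only if both the R2 threshold and the significance test are met.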

Table 1 Values of criteria maximum concentrations (CMCs) recommended and 26 physical and chemical properties for seven transition metals.
Table 2 Pearson product-moment parametric correlation of 26 characteristics of metal ions and the criteria maximum concentrations (CMCs) values by US EPA.

The parameters AN, AW, CR, r and AN/∆IP were significantly and negatively correlated with CMCs (Supplementary Fig. S1A–E). This result is consistent with previously reported findings that the toxic potency of a metal is determined by its electronic configuration (AN/∆IP)16, AN16,19 and AW19. Significant correlations between LD50 and AN for some mammals and between EC50 and AW for Daphnia magna have also been reported20,21. AN/∆IP represents the difficulty of metal ions to form covalent bonds due to configurations of their electrons and subsequent crystalline structures, where ∆IP is the change in ionization potential between ion oxidation numbers OX and OX−1. As a result, the potential for forming stable complexes between metal ions and biological ligands is directly related to toxic potencies of transition metals. Previous studies also indicated that AN/∆IP was negatively correlated with log EC50 (median effect concentration) of Lymnaea acuminata and LC50 (median lethal concentration) of Caenorhabditis elegans22,23. Parameters CR and r comprehensively describe the propensity of metal ions to form covalent and ionic bonds. In a similar study with cabbage plants (Brassica oleracea L var capitata cv Soshu), Enache et al.24 noticed that increased inherent toxicity of metals was generally accompanied by increasing AN, CR and r.

Alternatively, σp and AR/AW are positively correlated with CMCs, such that ions of metals with stronger hydrolysis and ionization potential have lesser toxic potency to aquatic organisms (Supplementary Fig. S1F,G). The softness index σp, derived by application of the Hard-Soft-Acid-Base (HSAB) theory, is indicative of the ability of metal ions to lose their valence electrons, while AR/AW is regarded as a measure of the electron density of ions. The results presented herein are consistent with those of previous studies. For instance, significant positive correlations between σp and LD50 determined in toxicity tests with mice were obtained for all hard, soft and borderline metal ions25. A positive correlation between AR/AW and EC50 values was also noted21.

Moreover, the two parameters with the largest coefficients of determination (R2) in PPCR models are σp (R2 = 0.75; F = 17.6 and P = 0.006) and CR (R2 = 0.62; F = 9.9 and P = 0.020). Consistently, σp is significantly and positively correlated with logEC50 and is the single best parameter used to predict relative toxic potencies of metal ions to a range of species, including Vibrio fischeri, Helianthus annuus Sunspot and four arthropods (Chironomus tentans, Planaria, Crangonyx pseudogracilis and Daphnia magna)22,23,26. However, in contrast to the results obtained in the present study, Khangarot et al.27 observed no significant correlation between CR and EC50 of Cypris subglobosa. The reason for such a discrepancy may be that these authors investigated the sensitivities among metals for a single species, whereas we considered threshold values for protecting all aquatic organisms.
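As a consistency check on the statistics quoted above, the F statistic of a linear regression can be recovered from R2. The sketch below assumes n = 8 observations, i.e., eight recommended CMCs with Cr(III) and Cr(VI) counted separately; this count is an inference and is not stated explicitly in the text. The function name `f_statistic` is hypothetical.

```python
def f_statistic(r_squared, n, p=1):
    """F statistic of a linear regression with p predictors and n observations."""
    return (r_squared / p) / ((1.0 - r_squared) / (n - p - 1))

N_CMC = 8  # assumption: Cr(III) and Cr(VI) are separate training points

f_sigma_p = f_statistic(0.75, N_CMC)  # text reports F = 17.6 for the softness index
f_cr = f_statistic(0.62, N_CMC)       # text reports F = 9.9 for the covalent radius
```

Under this assumption the recomputed values agree with the reported ones to within the rounding of R2 to two decimals.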

Development of an Integrated Radius-PPCR Model

It has been difficult to predict relative potencies of metals by use of a single structural parameter13. Thus, it might be more appropriate to use several common and easy-to-obtain physicochemical properties28. Because values for σp and ∆IP were scarce, data for AN, AW, CR, r and AR/AW, which are readily available, were used to predict CMCs. However, because there were multiple parameters with correlation coefficients greater than 0.65, the information produced by the models described in the preceding section was somewhat redundant. To address this issue and extract canonical relationships, PCA was used to reduce the number of independent variables to a small set of integrated variables. Contributions to the PCs by the reduced number of variables were determined and all autocorrelations were eliminated.

Because coefficients of determination of pairwise correlations between AN, AW and AR/AW were all greater than 0.87, PCA analyses were conducted on four different combinations of the parameters: (1) CR, r and AR/AW; (2) AN, CR and r; (3) AW, CR and r and (4) all five parameters. The accumulated proportions of the first PC were 88.8%, 87.8%, 85.7% and 85.0%, respectively, for the four PCA analyses (Table 3). Thus, the first PCs were all selected to construct the PPCR models with single-parameter linear regression (Table 3). Among the four regressions, the one based on X1 = 0.567CR + 0.568r − 0.597AR/AW was the best fitted (R2 = 0.63, F = 10.2, P = 0.019). The results of internal cross-validation for the finally selected model were Qcv2 = 0.55 and RMSEcv = 0.32, which demonstrated that the model was robust. In addition, the results of the applicability domains were acceptable, indicating that the model could be applied for predicting CMCs of other metals (Supplementary Figs S2 and S3). Herein, X1 is defined as integrated radius (IR) related to AR, CR and r, which are all basic parameters for describing metal properties including toxic potency29.
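The cross-validation statistics quoted above can be illustrated with a minimal k-fold sketch. Several choices here are assumptions rather than the authors' procedure: folds are assigned by index modulo k, Qcv2 is computed as 1 − PRESS/SStot about the overall mean, and the data are placeholders, not the values behind Table 3.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def k_fold_q2(x, y, k):
    """Return (Q2cv, RMSEcv) for a single-descriptor linear model."""
    n = len(x)
    press = 0.0  # predictive residual sum of squares over held-out points
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        b0, b1 = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        press += sum((y[i] - (b0 + b1 * x[i])) ** 2 for i in test_idx)
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot, math.sqrt(press / n)

# Placeholder scores and ln(CMC) values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [6.1, 5.2, 5.0, 4.1, 3.3, 3.0, 1.9, 1.2]
q2, rmse = k_fold_q2(x, y, k=len(x))  # k = n gives leave-one-out
```

With so few training points, leave-one-out (k = n) is a natural choice, but the text does not state which k was used.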

Table 3 Regression models with principal components for criteria maximum concentrations (CMCs) at natural logarithmic scale, where R2 is the coefficient of determination, RSE is residual standard error, P is the statistical level of significance.

Some chemical and biological characteristics associated with adsorption and migration of ions are related to r30. For example, toxic potencies of metal ions are determined from their atomic orbital energies and r and metal ions with greater toxic potency mostly have multiple oxidation states31. In general, r and CR can be calculated from nuclear charge and electron configuration16. CR is also related to r32. IR accounts for the effects of the radius on toxicity and also averts redundancy. Thus, it was more accurate than a single parameter for predicting CMCs.
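The integrated radius can be written as a small function using the loadings reported for X1. This is a sketch under an assumption: the inputs are taken to be ln-transformed and standardized in the same way as the training data, since the exact scaling is not spelled out in the text. The names `IR_LOADINGS` and `integrated_radius` are hypothetical.

```python
# Loadings of the first principal component as reported in the text.
IR_LOADINGS = {"CR": 0.567, "r": 0.568, "AR/AW": -0.597}

def integrated_radius(z_cr, z_r, z_ar_aw):
    """Integrated radius X1 from three descriptor values.

    Assumption: inputs are ln-transformed and standardized like the
    training data; the scaling is not given explicitly in the text.
    """
    return (IR_LOADINGS["CR"] * z_cr
            + IR_LOADINGS["r"] * z_r
            + IR_LOADINGS["AR/AW"] * z_ar_aw)

# The squared loadings sum to about 1, as expected for a unit-length PC vector.
loading_norm_sq = sum(w * w for w in IR_LOADINGS.values())
```

The three loadings are nearly equal in magnitude, so each descriptor contributes about equally to IR, with AR/AW entering with the opposite sign.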

With the IR-PPCR model (Fig. 1), the recommended CMCs for all seven metals except Cr were within the 95% confidence intervals of the CMCs predicted from IR. In addition, the difference between the CMC for Hg predicted by IR and the recommended value was within ± 0.20, whereas differences for all other metals were within an order of magnitude. These results suggest that the model based on IR is capable of reliably predicting CMCs for transition metals. The WQC for Cu derived by the SSD approach were 30 ± 0.6133 and 48 ± 0.27 μg/L34, which are close to the predicted values of 39 and 35 μg/L obtained in the present study. As for Cr, the difference between the predicted and recommended values of CMCs is greater, probably because Cr has different valence states.

Figure 1

Predictive model for Criteria Maximum Concentrations (CMCs) on a natural logarithmic scale and integrated radius (X1) at 95% centile.

Data points of CMCs predicted from integrated radius (IR) and data points for USEPA-recommended CMCs are plotted with different symbols. The purple dashed line illustrates the 95% confidence interval.

Three factors related to the use of different radii in IR can explain the discrepancy between predicted and recommended CMCs of the seven transition metals. First, substantially different predictions may be obtained if different ion radii are used. The radii reported by different groups for the same metal are not always identical35 and an ion radius can be classified as one of several types, including Lande, Wasastjerna, Goldschmidt or Pauling36. The inter-nuclear distance between a positive and a negative ion is the sum of their radii, but the boundary between them is quite difficult to determine. Second, both Cu and Zn always occur as +2 cations in freshwater and thus can form stable hydroxo and carbonato complexes37. The order of stability constants for +2 cations of first-row transition metals to form a complex with a ligand, called the Irving-Williams stability series, is Cd2+ < Mn2+ < Fe2+ < Co2+ < Ni2+ < Cu2+ > Zn2+. Because Zn uses 4s4p2 tetrahedral orbitals, it often forms weaker complexes with organic ligands than other transition metals38. The effect of the ligand field in this case may cause uncertainties associated with r values used in the present study. Finally, AR cannot be determined directly and it is often measured with the assumption that the structure of metal atoms is spherically symmetrical. As with r, different values of the same metal radius have been measured and calculated by different groups, such as Slater36 and Pauling39. Therefore, the use of different AR values may have caused the different results.

Prediction and Comparison of Criteria Maximum Concentrations

CMCs of 56 transition metals in the fourth, fifth, sixth and seventh periods, including the lanthanide series and the actinide series, were predicted from the IR-PPCR model (Fig. 2A). Predicted CMCs of the lanthanides and actinides are similar (Fig. 2B,C). To facilitate pattern recognition, metals of the same period are divided into three groups, i.e., IIIB−VIIB, VIII and IB−IIB. Within the same period, CMCs increase with increasing atomic number for all three groups (Fig. 2D). Within the same group of the Periodic Table, CMCs decrease with increasing atomic number (Fig. 2D).

Figure 2

Predicted Criteria Maximum Concentrations (CMCs).

(A) Periodic Table of CMCs for transition metals, showing CMCs recommended by US EPA and predicted by the integrated radius-PPCR (Physicochemical Properties-CMCs Relationships) model. (B) The predicted CMCs of the lanthanides. (C) The predicted CMCs of the actinides. (D) Comparison among the predicted CMCs in the fourth (blue), fifth (red) and sixth (green) periods. The x axis of this graph is the group from IIIB to IIB, the y axis is the concentration of the predicted CMCs and the z axis is the period.

Median acute, lethal (LC50) concentrations, determined for 31 transition metals in one-week exposure experiments with Hyalella azteca (Crustacea) collected from Lake Ontario40, were correlated with the predicted CMCs. Exceptions were observed for yttrium (Y) and niobium (Nb), Cu and Zn, Ag and Cd and gold (Au) and osmium (Os) (Fig. 3A,B,D). Within the same group, the sequences of LC50 concentrations and predicted CMCs are identical for the pairs of vanadium (V) and Nb, Cu and Ag and Zn and Cd (Table 4). In addition, differences between LC50 concentrations or predicted CMCs with respect to atomic number are similar for the lanthanide and actinide series (Fig. 3C,E), probably because they are comparable in electronic structures in outside orbitals. If nominal LC50 concentrations40 are used for comparison, their sequences are the same as those of predicted CMCs for Y and Nb, Ag and Cd and Nb and tantalum (Ta) (Table 4). It should be noted that Au was excluded from the above assessment because there are insufficient toxicity data for this relatively unreactive metal.

Table 4 Comparison among Criteria Maximum Concentrations (CMCs) predicted by the model based on integrated radius (IR), median lethal concentration (LC50) for the fresh water amphipod (Hyalella azteca, Crustacea) in Lake Ontario (Burlington city tap, Canada) in soft water (nominal) and soft water (measured), for group IIIB, group VB, group VIII, group IB and group IIB.
Figure 3

Comparison among Criteria Maximum Concentrations (CMCs) predicted by the model based on integrated radius (IR), median lethal concentration (LC50) for the fresh water amphipod (Hyalella azteca, Crustacea) in Lake Ontario (Burlington city tap, Canada) in soft water (nominal) and soft water (measured), for seven transition metals in the fourth period (A), five transition metals in the fifth period (B), five transition metals in the sixth period (D), 14 lanthanide series metals (C) and two actinide series metals (E).

Toxic potency values expressed as CMCs and LC50 are similar between the lanthanide and actinide series metals (Fig. 3C,E). The lanthanides and actinides are similar in configurations of outside orbitals of electrons, which explains why most of their physical and chemical properties are similar. Lanthanides and actinides are also distinctly different from other elements in terms of physical and chemical properties because they have electrons in the f orbitals. The energy of the 4f sub-shell is lower than that of the 5d sub-shell for lanthanide metals, hence electrons fill the 4f sub-shell before the 5d sub-shell41. The “lanthanide contraction”, another important feature of the lanthanide series in which the 5s and 5p orbitals penetrate the 4f sub-shell, results in the 4f orbital being exposed to the increasing nuclear charge42. As a result, the atomic radius exhibits a decreasing trend throughout the series. This change in “charge density” might explain the difference in toxic potencies among the lanthanides. Therefore, the r and other physicochemical properties of the lanthanide metals beyond Eu in the Periodic Table are similar to those of Y and these metals have LC50 concentrations and predicted CMCs similar to those of Y.

Actinides can form chemical compounds in solutions as cations with relatively large ionic radii43. Similar to the lanthanides, energies of the 6s and 6p sub-shells of actinides are greater than that of the 5f sub-shell; therefore electrons fill the 5f sub-shell before the 6s and 6p sub-shells. Therefore, both the lanthanides and actinides have the ability to form stable complexes with ligands, such as chloride, sulfate, carbonate and acetate. Moreover, some lanthanides and all actinides are radioactive44,45 and also exhibit characteristics of heavy metals, such that they are often considered toxic to aquatic life at ambient concentrations46. These results further corroborate the accuracy of the model based on IR in predicting toxic potency of metals (Fig. 2).

There is an apparent difference in the patterns of toxic potencies and predicted CMCs for some transition metals, probably because only the physicochemical properties of the metals are considered in the model based on IR, without considering effects of characteristics of natural water. To predict the effects in surface waters, the results predicted by the model need to be adjusted to account for metal speciation and chemical activity or apparent concentrations in both fresh and marine water. Cation competition and binding to biotic ligands can be accounted for by combining the model with models that predict metal speciation, such as the Biotic Ligand Model (BLM), free ion activity model (FIAM) and gill surface interaction model (GSIM)47. The BLM provides a quantitative framework for assessing metal toxicity to aquatic organisms over a range of hardness, pH and dissolved organic carbon (DOC)48 and has been employed as a good solution to the problems associated with WQC for Cu49. However, it has been used to predict the toxicity of only a few metals, such as Cu, Ag, Cd and Ni, to a few species, including Salmo gairdneri, Pimephales promelas, Daphnia magna, Ceriodaphnia dubia and Daphnia pulex50. In general, if the BLM cannot be used, the toxicity data used to derive WQC need to be selected within a limited pH range, such as 6 to 8, and be hardness-normalized by use of hardness algorithms, which might not account for the effect of organic complexation. While further development and improvement of the predictive model are necessary and its range of applicability needs to be determined, the predictive model provides a promising screening-level tool for rapid prediction of criteria for metals lacking toxicity data and for water quality and risk assessment.

Importance and Uncertainties

Transition metals under investigation in the present study behave variably due to their individual physical and chemical properties; they have been widely used not only in industrial products but also in daily life. However, most transition metals exhibit significant toxic potency and some of them are even radioactive. Because of the difficulty of conducting experiments on these transition metals, there are few data on toxic potency to a range of species. As a result, it is difficult to establish water quality standards, conduct water quality assessment and practice risk management. Models obtained in the present study could be useful for deriving threshold values for data-poor transition metals. More importantly, the results of the present study demonstrated correlations between the physicochemical properties of transition metals and both WQC and toxic potencies of metals. The modeling approaches used in the present study have also opened up a new dimension for investigating the complex environmental behavior and toxic action of transition metals, which is important for examining toxic potency and threshold values for other metals as well.

Although the IR-based model developed in the present study can reasonably predict CMCs of transition metals with limited information, experimental verification and subsequent modifications of the model are deemed necessary in future studies. In addition, the metal valence and the effects of water chemistry on toxic potency should also be considered in further modifications of the model. Nevertheless, the predictive model provides a new approach for WQC development and water quality assessment of metals.

Methods

Preparation of CMCs and Physicochemical Properties of Selected Transition Metals

Seven transition metals (Cr(III), Cr(VI), Ni, Cu, Zn, Ag, Cd and Hg) for which CMCs have been recommended by the USEPA9 were selected as “test elements” in the training set of elements, to which the results of the predictive models could be compared. Based on the results of several previous studies12,28,51,52,53,54, 26 structural parameters characterizing various physical and chemical properties of the metal ions were investigated. They include AN12, AW28,52, AR28,52,53,55, CR51,52,53, r12,28,52,53, melting point (MP)52, density (D)52, enthalpy (heat) of vaporization (Eh)52, boiling point (BP)52, difference in ionization potentials between the ion oxidation numbers OX and OX−1 (∆IP (eV))12,28,54, electrochemical potential (∆E0 (V))12,28,52,54, log of first hydrolysis constant (|logKOH|)12, covalent index (Xm2r)12,24, polarization force parameters (Z/r, Z/r2 and Z2/r)12,24, σp12, ionization potential (IP)12,23,24,26, electronegativity (Xm)12,23,24, AN/∆IP12,23,24,26, AR/AW23,24, electronegativity index (x)12,23, relative softness (Z/rx, where x is an electronegativity index)12,23,24, similar polarization force parameters (Z/AR and Z/AR2)12,28,52,53 and ionic charge (Z)26. Some of these parameters, such as Z/AR, Z/rx and Z2/r, were recalculated to fit the model. Moreover, because the variables used to describe environmental concentrations often follow a lognormal frequency distribution, values of the descriptors were transformed to natural logarithms before use56.
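Some of the descriptors listed above are simple functions of ionic charge and radius. The sketch below shows how the polarization force parameters and the natural-log transform might be computed; the function names are hypothetical and the example ion (Z = 2, r = 0.5 in arbitrary length units) is a placeholder, not a value from Table 1.

```python
import math

def polarization_parameters(charge, radius):
    """Return Z/r, Z/r^2 and Z^2/r for ionic charge Z and ionic radius r."""
    return {
        "Z/r": charge / radius,
        "Z/r2": charge / radius ** 2,
        "Z2/r": charge ** 2 / radius,
    }

def ln_transform(values):
    """Natural-log transform; only defined for strictly positive values."""
    return {name: math.log(v) for name, v in values.items()}

# Hypothetical ion, for illustration only.
example = polarization_parameters(2, 0.5)
example_ln = ln_transform(example)
```

Descriptors that can be zero or negative would need separate handling before a log transform; the text does not say how such cases were treated.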

Statistical Analysis

Based on results of Pearson correlation analysis, 26 parameters were correlated with CMCs of the target metals recommended by the USEPA (Table 1), so that relationships between physicochemical properties and CMCs could be developed. These Physicochemical Properties-CMCs Relationships (PPCR) models were developed based on multiple linear regressions of those parameters with the greatest correlations and thus predictive power. Selected parameters and CMCs were used as independent and dependent variables, respectively. Principal component analysis (PCA) was used to manage multivariate variables by transforming relationships from a higher-dimensional space to a lower-dimensional space, which simplified and optimized the information in the multivariate data. After linear transformation of the original variables, several newly created variables expressed as principal components (PCs) can optimally represent the dynamic and interactive relationships among the original variables57. Since these comprehensive indices are perpendicular and minimally related, they can provide key non-redundant information about the original parameters. The first principal component (PC) generally explains the largest portion of the variation. While the number of PCs derived is equal to the total number of parameters included in the PCA, the number of PCs was chosen in the model so that greater than 85% of the total variance could be explained58. By using the PCA regression approach, the best correlation between the first principal component X1 and the recommended CMCs of the target metals was obtained. The model obtained by linear regression was used to predict CMCs for other transition metals. Principal component linear regression analyses were carried out by use of the R programming language and MATLAB (Mathworks, Natick, MA, USA).
The predictive potential of the model was evaluated with the coefficient of determination (R2), residual standard error (RSE), the value of the F-test statistic from analysis of the linear regression fit and the level of Type I error (P), with the level of significance set at α = 0.05.
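The dimension-reduction step described in this section can be sketched without external libraries. The example below extracts the first principal component of standardized descriptors by power iteration on their correlation matrix and reports the fraction of variance it explains. This is an illustration, not the R or MATLAB code used in the study; the function names and the three descriptor columns are hypothetical placeholders.

```python
import math

def standardize(column):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def first_principal_component(columns, iterations=500):
    """Return (loadings, explained_fraction, scores) for the first PC."""
    z = [standardize(c) for c in columns]
    m, n = len(z), len(z[0])
    # Correlation matrix of the standardized descriptors.
    corr = [[sum(z[a][i] * z[b][i] for i in range(n)) / (n - 1)
             for b in range(m)] for a in range(m)]
    v = [1.0 / math.sqrt(m)] * m  # starting vector for power iteration
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(m))
                     for a in range(m))
    scores = [sum(v[a] * z[a][i] for a in range(m)) for i in range(n)]
    # The eigenvalues of a correlation matrix sum to m.
    return v, eigenvalue / m, scores

# Placeholder descriptor columns, NOT the values from Table 1.
col_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
col_b = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
col_c = [8.0, 7.0, 6.5, 5.0, 4.2, 3.0, 2.1, 1.0]

loadings, explained, scores = first_principal_component([col_a, col_b, col_c])
use_first_pc_only = explained > 0.85  # 85% rule described in the text
```

The resulting scores would then serve as the single independent variable in an ordinary linear regression against ln(CMC).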

Model Validation

To reduce the probability of over-fitting and test the robustness of the model, internal validation was performed with the k-fold cross-validation correlation coefficient (Qcv2), for which the recommended minimum acceptable value is 0.5, and the cross-validated root mean square error of prediction (RMSEcv)59. Moreover, predictions of WQC and toxic potencies of metals are valid only if the properties of such metals are within the applicability domains of the developed QSAR models. The applicability domains of the developed QSAR models were evaluated with the hat value and Williams plot60. The hat value hi for each ith metal was calculated with $h_i = x_i (X^{T}X)^{-1} x_i^{T}$, where xi is a row vector of the parameter for an ith metal used to establish the QSAR model and X is the matrix formed by these row vectors. The hat value hi should be smaller than the warning h* value, i.e., the predicted CMC of an ith metal is located within the optimum applicability domains. The h* value was calculated with $h^{*} = 3(p + 1)/n$, where p is the number of variables used in the model and n is the number of recommended CMCs for metals.
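The applicability-domain check can be sketched for a single-descriptor model. The code assumes the conventional leverage definition h_i = x_i (X^T X)^{-1} x_i^T with an intercept column in X, which for one descriptor reduces to 1/n + (x_i − mean)^2/Sxx, and the commonly used warning value h* = 3(p + 1)/n. The function names and the scores are hypothetical placeholders.

```python
def hat_values(x):
    """Leverages for a one-descriptor linear model with an intercept."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    return [1.0 / n + (v - mean) ** 2 / sxx for v in x]

def warning_leverage(p, n):
    """Warning value h* = 3(p + 1)/n for p model variables and n training points."""
    return 3.0 * (p + 1) / n

# Placeholder integrated-radius scores for eight training points.
ir_scores = [-2.0, -1.2, -0.5, 0.0, 0.3, 0.9, 1.1, 1.4]

h = hat_values(ir_scores)
h_star = warning_leverage(p=1, n=len(ir_scores))
inside_domain = [value < h_star for value in h]
```

In a Williams plot these leverages are plotted against standardized residuals, and points to the right of h* are treated as outside the applicability domain.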

Additional Information

How to cite this article: Wang, Y. et al. Directly Predicting Water Quality Criteria from Physicochemical Properties of Transition Metals. Sci. Rep. 6, 22515; doi: 10.1038/srep22515 (2016).