Introduction

Oligomeric surfactants with multiple hydrophobic and hydrophilic groups are a new class of surfactants which become attractive in recent years. Dimeric surfactant, also called gemini surfactants, are simplest oligomeric surfactants which contain two hydrophobic and two hydrophilic groups connected by the spacer group. Trimeric surfactants are natural extension of the gemini surfactants. Their structure represent intermediate structure between dimeric surfactants and higher oligomeric ones. The first report on trimeric surfactants was by Raoul Zana et al.1,2,3. Generally, the oligomeric surfactants are constructed by two or more hydrophobic tails and polar head groups linked by the spacer groups. The spacer groups of trimeric and higher oligomerization degree surfactants can be linear, ring-like or star-shaped therefore the oligomeric surfactants are categorized into linear, ring-like and star-shaped, and their aggregation behavior strongly depends on their topological structures4. The star-shaped trimeric surfactants exhibit more unique self-aggregation behavior in aqueous solution compared to their linear analogues4.

Generally, the trimeric surfactants show an excellent surface-active and self-aggregation properties in aqueous solutions. The experimental studies5,6,7,8,9 and also the theoretical considerations10 show that the critical micelle concentration decreases with the increasing the degree of oligomerization. Thus, the critical micelle concentration of trimeric surfactants is much lower than those of the corresponding monomeric and dimeric analogues. Also, the cationic trimeric compounds exhibit strong antimicrobial activity. They are active against board range of microorganism such as bacteria and fungi11,12 and probably can be used against some viruses.

The main aim of this work is designing the structure of cationic star-shaped trimeric surfactants most active in micelle formation otherwise to study the effect of the structure modifications on the critical micelle concentration value. This work is a continuation of studies on structure–property relationship (QSPR) of cationic surfactants13,14,15. The subject of the previous studies was the critical micelle concentration (cmc) of cationic monomeric13 and cationic gemini14,15 surfactants. In the papers13,14,15, the relationship of the critical micelle concentration and the structure of cationic surfactants was investigated using the molecular connectivity indices only. In the present work these topological indices were also used to study the influence of the chains structure on the value of critical micelle concentration of star-shaped cationic trimeric surfactants. As was suggested in paper4 the topological structure of the oligomeric surfactants strongly affect the aggregation behavior of these compounds. Therefore, the topological descriptors like the molecular connectivity indices16 can be very good representation of the oligomeric surfactant structure in studies of self-aggregation properties, and in particular of the critical micelle concentration, as was shown in the previous papers concerning the gemini surfactants14,15.

The critical micelle concentration depends not only on geometrical structure but also on a number of other parameters, among them a kind of counterion. Therefore, in order to minimize the influence of factors other than geometrical on cmc value, only the star-shaped cationic surfactants with bromides as counterions were taken into account. Although, the obtained QSPR model has been derived for compounds with fixed counterions, it can be used to design the structure of cationic trimeric star-shaped surfactants with any kind of counterion because the impact of modifying the geometrical structure on changes in the cmc value, its decrease or increase, should be the same.

The semi-empirical calculations of the atomic charges were also performed to investigate the effect of branches and heteroatoms contained in the spacer group on the critical micelle concentration of studied trimeric surfactants.

Results and discussion

All investigated cationic trimeric surfactants are star-shaped type surfactants. The geometrical structures of investigated compounds are significantly different. The entire data set includes thirteen training set compounds (compounds 113) and five test compounds (compounds T1T5). The structures of the molecules taken from literature along with the logarithms of the literature cmc values are shown in Methods section.

Molecular connectivity indices were calculated basing on the graphic structural formula of the molecules 113 using the expressions contained in Methods section (Eqs. 35). The values of five molecular connectivity indices from zero to fourth order and five valence molecular connectivity indices from zero to fourth order of training compounds are given in Table 1.

Table 1 Values of molecular connectivity indices.

Based on the values of connectivity indices (Table 1) and the logarithms of literature cmc values of training compounds 113, using the polynomial regression analysis and stepwise method, the two-variable equation (Eq. 1), expressing the relation between logarithm of cmc and the molecular connectivity indices, have been obtained:

$$ Log_{10} cmc = - (1.10878 \mp 0.18486) + (0.00428 \pm 0.00058)\, \cdot \,({}^{2}\chi )^{2} - (0.00709 \mp 0.00055)\, \cdot \,({}^{1}\chi^{\nu } )^{2} $$
(1)

The statistical characteristics of the equation variables are given in Table 2.

Table 2 Statistical characteristics of variables included in Eq. (1).

The obtained model (Eq. 1) contains first-order valence molecular connectivity index \({}^{1}\chi^{\nu }\) and second-order molecular connectivity index \({}^{2}\chi\). The first-order valence molecular connectivity index \({}^{1}\chi^{\nu }\) is path-type index and it represents one-bonds fragments in a molecule. The values of this index depend on the isomers of the compound16 and decrease with increasing in branching. The molecular valence connectivity index \({}^{1}\chi^{\nu }\) is a valence connectivity index thus differentiates heteroatoms and multiple bonds. The second-order \({}^{2}\chi\) molecular connectivity index appearing in the model (Eq. 1) represents two-bonds fragments in the molecule. The values of this index also depend on the isomers of the compound16, and in this case the values increase with increasing in branching. Also, the values of \({}^{2}\chi\) and \({}^{1}\chi^{\nu }\) indices increase with the increase in the number of atoms in the molecule by extending the hydrocarbon chains or adding atoms to the chains through branching.

From Table 1 it can be concluded that the values of \({}^{2}\chi\) index are smaller compared to the values of \({}^{1}\chi^{\nu }\) index. The obtained model (Eq. 1) contains the \(({}^{1}\chi^{\nu } )^{2}\) variable with a negative coefficient and the \(({}^{2}\chi )^{2}\) variable with a positive one, and the absolute value of these coefficients are 0.00709 and 0.00428, respectively. Therefore, the analysis of equation (Eq. 1) allows us to conclude that as the \({}^{1}\chi^{\nu }\) index increases, the cmc decreases.

The graphical comparison of the calculated \(Log_{10} cmc\) values using Eq. (1) and the experimental \(Log_{10} cmc\) values is shown in Fig. 1.

Figure 1
figure 1

Scatter plot of the calculated \(Log_{10} cmc\) versus the experimental \(Log_{10} cmc\) for compounds of training set (r = 0.981, F = 280.480, s = 0.173, r2adj = 0.959, \(Q_{LOO}^{2}\) = 0.925) (rhombus) and test set (\(R_{pred}^{2}\) = 0.531) (triangle).

The \(Log_{10} cmc\) values calculated using Eq. (1) and the experimental \(Log_{10} cmc\) values for studied trimeric surfactants, along with the values of residuals, are contained in Table 3.

Table 3 Calculated and literature values \(Log_{10} cmc\) of compounds from training and test set.

The plot of residuals versus the experimental values of \(Log_{10} cmc\) is shown in Fig. 2.

Figure 2
figure 2

Plot of residuals versus the experimental \(Log_{10} cmc\) values for training set (rhombus) and test compounds (triangle).

As can be seen from Table 3 and Figs. 12, the calculated values of \(Log_{10} cmc\) using Eq. (1) are very close to the from literature ones.

As shown in Table 6 given in Methods section, the Eq. (1) has been obtained based on molecules with a different topological structure. These compounds differ in the length of the hydrophobic chains as well as the structure of spacer groups. Therefore, it is worth analyzing, based on the obtained equation, the influence of hydrophobic tails and spacer group on the values of logarithm of cmc.

Hydrophobic tails effect

The experimental cmc values and thus \(Log_{10} cmc\) of star-shaped cationic trimeric surfactants decrease when the tails are lengthened between ten and fourteen (compounds 13) or ten and sixteen (compounds 46 and T4) carbon atoms. The corresponding plots of the calculated, using Eq. (1), and the experimental \(Log_{10} cmc\) versus the alkyl chain carbon number of compounds 13 and compounds 46 and T4 are shown in Figs. 34.

Figure 3
figure 3

Calculated (square) and experimental17 (triangle) \(Log_{10} cmc\) values of star-shaped trimeric compounds 13 versus the number of alkyl chain carbon atoms (n).

Figure 4
figure 4

Calculated (square) and experimental18 (triangle) \(Log_{10} cmc\) values of star-shaped trimeric compounds 56 and T4 versus the number of alkyl chain carbon atoms (n).

As shown in Figs. 34, the \(Log_{10} cmc\) values calculated using Eq. (1) also decrease with tails lengthening from ten to eighteen (Figs. 34) carbon atoms, and as shown in Figs. 34, with this range the dependence of both the calculated and also experimental logarithm values of cmc on the number of carbon atoms in the alkyl chains is linear. As can be seen in Fig. 34, the calculated \(Log_{10} cmc\) values are very close to experimental values. For experimental \(Log_{10} cmc\) values of compounds in Fig. 4 there is a little deviation from linearity for the twelfth carbon atoms chain, but this deviation is probably within the margin of experimental error. Thus, it can be concluded that in the range between the ten and the eighteen carbon atoms the dependence of \(Log_{10} cmc\) on the number of alkyl chain carbon atoms (n) of star-shaped cationic trimeric surfactants can be described by linear function.

However, in the case of longer chains, above eighteen carbon atoms, the dependence of the \(Log_{10} cmc\) values calculated using Eq. (1) on the number of carbon atoms will be non-linear due to the non-linear nature of this equation. The non-linear dependence of logarithm of cmc on the chains length has been observed for various cationic gemini surfactants of different nature and flexibility spacer groups or tails25,26,27. Also, the non-linear relationship between \(Log_{10} cmc\) and the number of tail carbon atoms for cationic gemini surfactants was obtained basing on theoretical model28.

Spacer group effect

The structure and nature of the spacer group play an important role in micelle formation of oligomeric surfactants. The studies on gemini surfactants29,30,31,32,33 show that such features of the spacer group as flexibility and hydrophobicity have significant impact on the aggregation behavior in aqueous solution of these compounds. Therefore, the effect of such factors which influence the flexibility and hydrophobicity as spacer’s length, branching and heteroatoms on the cmc of star-shaped trimeric surfactants have been investigated using Eq. (1) (Table 4). In Table 4 the star-shaped type trimeric surfactants with different spacers groups structures and corresponding \(Log_{10} cmc\) values calculated using Eq. (1) are shown.

Table 4 Calculated \(Log_{10} cmc\) values of star-shaped trimeric surfactants with different spacer group (S1-S10).

The inspection of data contained in Table 4 shows that the greatest value of critical micelle concentration is for compound S6 and the lowest for compound S1.

In the case of compounds with different central groups in the spacer group (compounds S1S4), the lowest cmc value is for S1 compound with the cyclohexane-based spacer group, and the highest for compound S4 with the central nitrogen atom in the spacer group. This lowest cmc value of compound S1 may be due to the fact that the spacer is more hydrophobic which may cause its incorporation into the interior of the formed micelle. Also, as shown in Table 4, the cmc value of compound S2 is lower than cmc value of compound S4. Similar results was obtained by Nacham et al.21. The authors of paper21 have studied among others the ionic liquid-based star-shaped trimeric surfactants containing triethylamine and triethylbenzene spacer groups. As was shown in21 the cmc value of IL-based trimeric surfactant containing triethylbenzene spacer groups is lower than the cmc of that containing triethylamine spacer group with identical alkyl tails and head groups. The authors of this paper also suggest that this difference of cmc values may be due to the higher hydrophobicity imparted by the benzyl core in investigated IL-trimeric surfactant. Table 4 also shows that cmc value of compound S1 is lower compared to the cmc value of compound S2. This difference may result from the fact that substituted cyclohexane ring is more flexible compared to the benzene one, and thus the cyclohexane-based spacer group easily incorporates into the interior of the formed micelle.

Among the compounds having nitrogen as the central group and the same number of non-hydrogen atoms in the spacer group (compounds S4S6 and S8S10), the lowest critical micelle concentration is for compound S4 and the greatest for compound S10. In the case of the compounds with straight spacer chains (compounds S4S6), the cmc value increases in the order: –CH2– < –NH– < –O–. The similar order is for compounds with branched spacer group (compounds S8S10), the order of increasing cmc value is: –CH3 < –NH2 < –OH. In addition, the comparison of the critical micelle concentration values of compounds with straight chains of star-shaped spacers (compounds S4S6) with the corresponding compounds having branched spacer group (compounds S8S10) shows that the critical micelle concentration values of compounds having the branched spacer groups are greater compared to the cmc values of the corresponding compounds with straight spacer’s chains.

In summary, Table 4 shows that in the case of star-shaped spacers having the same number of non-hydrogen atoms, the branches and the heteroatoms cause the increase the critical micelle concentration value.

The effect of spacer’s branches and heteroatoms on cmc value of compounds presented in Table 4 (compounds S4S6 and S8S10) was also analyzed using the atomic charges. The total charges of different spacer’s functional groups found in the Y position of compounds S4S6 and S8S10 are presented in Table 5.

Table 5 Total charges of the different functional groups found in the Y position (see Table 4).

The AM1 semi-empirical calculations (Table 5) show that the atoms of high electronegativity such oxygen and nitrogen introduce a great negative charge to the spacer group, greater than the carbon atom thus the total charges of functional groups containing heteroatoms are negative. Moreover, the total charges of the various functional groups shown in Table 5 show that for compounds both straight and branched spacer group the total charge changes from positive for functional group with carbon atom to negative for functional groups with nitrogen or oxygen. Additionally, the negative charge is greater in the case of a group with an oxygen atom than with a nitrogen atom. Comparing these results with the data from Table 4, it can be concluded that the cmc values increase with the increase in the negative charge of the functional group.

The data in Table 5 also show that the total (negative) charge of the functional group, in the branched spacer chain compared to that of the corresponding functional group in the straight chain, is lower for branched spacer when the non-hydrogen atom in the functional group in Y position is oxygen or nitrogen, and greater (positive) charge also for branched spacer group when the non-hydrogen atom in the functional group in Y position is carbon atom. Comparing these results with the data from Table 4, it can be seen that when the non-hydrogen atom in the functional group is carbon atom, the results from Table 5 agree very well with those in Table 4 discussed above, i.e. the greater the positive charge, the greater the critical micelle concentration. In the case of functional groups containing heteroatoms (Table 4), it can be seen that the cmc value is greater for compounds with branched spacer group for which the negative charge of the functional group is lower than the negative charge of analogous functional group in straight spacer chain (Table 5).

The obtained model also shows that the cmc value of star-shaped trimeric surfactants decreases with the increase in the chains length of the spacer group (compounds S4 and S7). This is due to the fact that changes in the length of hydrocarbon tails have a significant impact on cmc values, and thus on obtained model. However, the experimental data34 shows the opposite conclusion. As shown in34, the cmc value of 3C12tris-s-Q cationic trimeric surfactants is greater for s = 6 than for s = 3. This suggests that the cmc increases with increasing spacer chains length (s) from three to about six methylene groups. A similar effect of spacer chains length is observed for dimeric surfactants29,30. The cmc values of 12–s–12 gemini surfactants increase with increasing number of spacer carbon atoms, up to four or five methylene groups, and then decrease with further elongation of the spacer chain29,30. Therefore, probably the increase in cmc value of trimeric surfactants with spacer chains length may have maximum at about six carbon atoms and then, as for gemini surfactants, the cmc may decrease with further chains length elongation. To the best of the author's knowledge, spacer chains lengthening above six methylene groups for star-shaped trimeric surfactants has not been studied. Therefore, the obtained model should be used probably for compounds with more than six carbon atoms in each chain of the trimeric surfactant spacer group when examining the effect of the spacer chains elongation on the cmc value.

Finally, if we examine only the changes in cmc value due to changes in the geometrical structure, the obtained equation (Eq. 1), although derived for bromide compounds, can also be used for compounds with another counterion, because the changes in cmc, its increase or decrease, should be the same. In other words, when examining changes in the cmc of bromide star-shaped trimeric surfactants, we can expect that in the case of chloride analogous compounds will be similar. For example, for chloride compounds 1,1,1-Tris[2-hydroxy-3-(dodecyldimethylammonio)-propoxymethyl]ethane Trichloride35 and the so-called III-12-4 trimeric surfactant36, for which the experimental cmc values are 0.223 (mM) and 0.15 (mM), respectively, the calculated values using Eq. (1) are 0.198 (mM) and 0.187 (mM), respectively. So, the calculated cmc value decreases as the experiment shows.

Methods

Experimental data

All investigated molecules are trimeric quaternary ammonium bromide surfactants. The structures of all considered compounds along with the logarithms of the literature cmc values17,18,19,20,21,22,23,24 are shown in Table 6.

Table 6 Molecular structures and logarithms of experimental cmc values of training (113) and test (T1T5) compounds.

The literature values of critical micelle concentration were given in (mM)17,18,19,20,21,22,23,24. The cmc values expressed in molar units (M) have been converted to logarithms of cmc (Table 6).

Most values of cmc were measured at 25 °C in aqueous solution. The cmc of compound T3 was measured at 22 oC.

Molecular connectivity indices

Molecular connectivity indices, some of the topological descriptors to characterize molecules in structure–property and structure–activity studies, are calculated from the molecular graph. The molecular graph is a graphical representation of the structural formula of the chemical compound, in which vertices represents atoms and edges symbolize covalence bonds.

The first connectivity index was proposed by Randić37 and was defined as:

$$ \chi = \sum\limits_{k} {\left( {\delta_{i} \delta_{j} } \right)}_{k}^{ - 0.5} $$
(2)

where \(\delta_{i}\) is a connectivity degree i.e. the number of non-hydrogen atoms to which the i-th non-hydrogen atom is bonded. The Kier and Hall molecular connectivity indices16 are generalizations of Randić’s connectivity index and the m-th order Kier-Hall molecular connectivity index is defined as16:

$$ {}^{m}\chi_{k} = \sum\limits_{j = 1}^{{n_{m} }} {\prod\limits_{i = 1}^{m + 1} {\left( {\delta_{i} } \right)_{j}^{ - 0.5} } } $$
(3)

where \(\delta_{i}\) is a connectivity degree, m is the order of the connectivity index, k denotes type of the fragment of the molecule for example: path (p), cluster (c) and path-cluster (pc), nm is the number of fragments of type k and order m.

The m-th order valence molecular connectivity index is defined16:

$$ {}^{m}\chi_{k}^{\nu } = \sum\limits_{j = 1}^{{n_{m} }} {\prod\limits_{i = 1}^{m + 1} {\left( {\delta_{i}^{\nu } } \right)_{j}^{ - 0.5} } } $$
(4)

where the valence connectivity degree \(\delta_{i}^{\nu }\) is defined as:

$$ \delta_{i}^{\nu } = \frac{{Z_{i}^{\nu } - h_{i} }}{{Z_{i} - Z_{i}^{\nu } - 1}} $$
(5)

where \(Z_{i}^{\nu }\) is the number of valence electrons in the i-th atom, hi is the number of hydrogen atoms connected to the i-th atom and \(Z_{i}\) is the number of all electrons in the i-th atom.

The values of the molecular and valence molecular connectivity indices of training compounds are contained in Table 1.

Atomic charges

Atomic charges were calculated using the semi-empirical molecular orbital package MOPAC 7 included in the VEGA program38,39, employing the semi-empirical AM1 method. All calculated atomic charges are expressed in atomic units (a.u.).

Statistics

Each formula expressing the relationship between the \(Log_{10} cmc\) and molecular connectivity indices was generated using the least-squares method and the final equation was obtained using the stepwise method. Pearson correlation coefficient (r), the adjusted coefficient of determination (r2adj), the standard deviation of the fit (s) and the Fisher ratio (F) were used to select the best model. The model obtained was selected according to following principles: highest correlation coefficient, adjusted coefficient of determination and Fisher value, the lowest standard deviation of the fit and also the smallest possible number of significant descriptors in the model. For good quality QSPR model the positive value of correlation coefficient (r) should be closer to 1. High values of F-test indicate that the model is statistically significant. The number of variables in the model should not exceed the number of compounds divided by five40. The statistical characteristics of the equation variables includes standard error, t-value and p-value. High absolute Student t value of the variable expresses that the coefficient of the variable is significantly larger than the standard error. Variable with p value below 0.05 is considered statistically significant41.

The leave-one-out cross-validated correlation coefficient (\(Q_{LOO}^{2}\)) and predictive correlation coefficient (\(R_{pred}^{2}\)) were also used to indicate the internal and external validation of derived model. The model is considered to be excellent, if \(Q_{LOO}^{2}\) is equal or more than 0.942. For acceptable QSPR model, the values of \(Q_{LOO}^{2}\) and \(R_{pred}^{2}\) should be more than 0.540,43.

The statistical calculations were made using the program STATISTICA 1244.

Conclusions

The aim of the presented work was to find a simple equation expressing the dependence of the critical micelle concentration of cationic star-shaped trimeric surfactants on their geometrical structures represented by topological indices, that will allow for examining the impact of structure modifications on the cmc values and thus it can be helpful in the design of novel star-shaped cationic trimeric surfactants. Employing the polynomial regression analysis, the equation with two molecular connectivity indices was obtained. Using the obtained model, the influence of hydrophobic tails length and structure of star-shaped spacer group on the critical micelle concentration was examined.

The analysis of the influence of tails length in the range between ten and eighteen carbon atoms confirmed the experimental results and the Klevens equation45 which expresses a linear dependence of logarithm of cmc on the number of carbon atoms in the alkyl chains.

The analysis of star-shaped spacers with the same number of non-hydrogen atoms allows us to conclude that heteroatoms cause the increase the cmc. These results were confirmed by atomic charge analysis. Functional groups containing highly electronegativity atoms, such as oxygen or nitrogen, introduce into the spacer group the negative charges, which cause an increase in the critical micelle concentration. The spacer group analysis also reveals that the branches cause the increase the critical micelle concentration too.