Introduction

Numerous technologies exist for capturing CO2 including chemical absorption, cryogenic separation, removal with membranes, and adsorption with zeolites or metal–organic frameworks1,2,3,4,5,6. The cyclic chemical absorption and regeneration process based on common primary and secondary amines such as monoethanolamine (MEA) and diethanolamine (DEA) is the most mature in industrial applications3,5. Unhindered primary and secondary amines react rapidly with CO2 to form very stable carbamates. The amount of energy required for the regeneration of these solvents is large. Carbon capture applied to a coal-fired power plant may reduce the net output of the plant by 30%6. With sterically hindered amines or tertiary amines like the standard methyldiethanolamine (MDEA), CO2 is captured as bicarbonate, which has a much smaller heat of reaction than carbamate formation, resulting in regeneration energy savings7. Moreover, their CO2 absorption capacity is much higher. Tertiary amines are therefore increasingly used in the high-pressure natural gas treatment industry to remove acid gases like CO2. However, in general, the rate of direct bicarbonate formation is much lower than that of carbamate formation resulting in much slower CO2 absorption rates with tertiary amines and thus in unacceptable large equipment for low pressure, anthropogenic (flue gas), CO2 capture applications5,7. To tackle this problem, several approaches were suggested. Several studies reported that the usage of a catalyst allows one to speed up the absorption of CO2 and/or to lower the energetic cost of solvent regeneration8. Another option, which is currently followed by the industry, consists in adding an activator, piperazine, significantly boosting the overall CO2 absorption rate without increasing the regeneration energy too much9. A more straightforward strategy would be the identification of new tertiary amines with much higher absorption rates with respect to standard MDEA and to which piperazine can eventually be added. Since experimental measurement of CO2 absorption kinetics is a time and labor-intensive process, the rational approach to the design of tertiary amines that can rapidly absorb CO2 requires a quantitative model enabling to select only the best candidates for experimental measurements.

Concerning alternative processes based on adsorption in porous solids (still under development), a lower theoretical energy consumption is expected due to the weaker physical adsorption. Molecular simulations and machine learning have already been extensively used to perform virtual screening of hundreds of thousands of structures to identify potentially better materials for CO2 adsorption10,11. Until now it was not possible to apply a similar methodology for amines, because of the difficulty related to the computation of chemical reactions. Amines were rationally designed based on physical and thermodynamic properties and the CO2 absorption rates were measured experimentally for only the most promising candidates7,12. Previously, machine-learning algorithms were tentatively applied for modeling quantitative structure–property relationship (QSPR) of alkanolamines’ CO2 absorption-related properties13,14,15,16,17,18. However, the availability of only a very small amount of data points limited the applicability domain of the models. Hence, to address this issue, we developed and applied a methodology for the identification of tertiary amines effectively absorbing CO2 based on the combination of molecular simulations19 and machine learning. In parallel, an experimental setup for the measurement of CO2 absorption rates has been specifically designed and put in place to validate the approach.

Results and discussion

Design of the methodology for CO2 absorbents screening

The workflow of the methodology is presented in Fig. 1. Chowdhury et al.20 published a consistent experimental dataset of the absorption rates of CO2 for 24 aqueous tertiary amines (313 K, 30 wt% amine). In the absence of a clear relationship between the structure or the chemical properties (e.g., the basicity) of the amines and the CO2 absorption rates, we developed a molecular dynamics (MD) based model that can accurately predict those experimental CO2 absorption rates19. It was found that, while the basicity of the amine (quantified by the pKa) is important, the key to the precision of molecular simulations is the inclusion of subtle but important solvation effects in the calculation of the activation Gibbs free energy of the reaction with an accuracy better than 1 kJ mol−1. One of the important features of the MD model19 is the robustness to reasonable changes in the concentration of amine and in temperature, enabling to apply it to a rather wide range of experimental setups. Hence, the model was applied to predict the rates at 13 mol% of amines and at 323 K, because these conditions are more representative of industrial absorption5.

Fig. 1: Workflow of the methodology suggested in this paper.
figure 1

a A high precision molecular simulation-based model for absorption rate prediction is developed19 and validated with experimental data on CO2 absorption rates for 24 tertiary amines20. The accuracy of the Gibbs free energies of absorption is better than 1 kJ mol−1 in comparison to experimental values19. b This model is applied to a diverse dataset containing 100 tertiary amine structures to calculate the CO2 absorption rate (RMD) and free energy of absorption (ΔGMD) (see “Methods”). c QSPR models were built for RMD and ΔGMD. d QSPR models were used to select perspective commercially available amines from public datasets. e Experimental measurement of CO2 absorption rates for selected amines. f The most selective ones can be further studied and eventually tested in a pilot unit.

Being much less resource- and cost-demanding, molecular simulations can thus be used instead of the experiments to get enough data for building a reliable QSPR model with a wide applicability domain.

Molecular simulations of CO2 absorption process

A dataset containing 100 structurally diverse tertiary amines was composed based on the in-house TotalEnergies’s dataset of amines with known experimental properties, complemented with tertiary amines extracted from literature and public databases (PubChem21,22, ZINC23,24). The selected compounds comprise diverse chemotypes, including linear and cyclic amines, diamines, amines containing thiol and thioether groups. Molecular simulations (see “Methods”) were performed for the initial set of 24 amines and for the selected set of 100 amines at 323 K and using a 13 mol% concentration of amine. From MD simulations absorption rates (RMD) and free energies of absorption (ΔGMD) were obtained. Notably, the RMD values calculated at 313 and 323 K are highly correlated (Fig. 2a, Spearman rank correlation coefficient (ρ) 0.99).

Fig. 2: Results of molecular dynamics simulations of the CO2 absorption process.
figure 2

a CO2 absorption rates (RMD) at 313 K (gray) and 323 K (orange) predicted using MD and the experimental absorption rates (Rlit) at 313 K. b RMD vs predicted pKa values (pKa). c energy of absorption (ΔGMD) predicted by MD vs RMD. d ΔGMD vs predicted pKa. The 24 amines from Chowdhury et al.20 are shown in orange. The 100 amines for which MD simulations were performed are shown in black. Industrially used reference compound (MDEA) is shown in green.

As shown in Fig. 2b, the most rapidly absorbing compound according to the MD calculations and the data from Chowdhury et al.20 was 3-(Diethylamino)-1,2-propanediol (DEA-1,2-PD). However, most of the other compounds with the largest predicted rates of absorption (RMD) contained either piperidine or pyrrolidine cycles. This is in line with the data from Chowdhury et al.20, who showed that 3-piperidino-1,2-propanediol (3PP-1,2-PD) and 1-methyl-2-piperidineethanol were significantly faster than the industrially used methyldiethanolamine (MDEA). Figure 2c illustrates that the computed CO2 absorption Gibbs free energies ΔGMD are almost perfectly correlated with the CO2 absorption rates, RMD (Spearman ρ −0.98): the slower the CO2 absorption, the higher the absorption Gibbs free energy. The correlation is not linear, and the decrease of ΔGMD slows down significantly at higher CO2 absorption rates.

Virtual screening of tertiary amines and experimental validation

Machine-learning algorithms were applied to establish quantitative structure–property relationships and screen a set of tertiary amines from a public dataset. The values of pKa predicted by the OPERA model25 can be used as a rather good predictor for ΔGMD. Indeed, the fitting of linear regression with the pKa values as the only predictor leads to a reasonable predictive performance in cross-validation (Supplementary Table 2). For modeling both end-points (ΔGMD and RMD), we implemented a machine-learning workflow combining several machine-learning algorithms and various descriptors of molecular structures. Thus, predicted pKa values were complemented with other descriptor types: physicochemical descriptors from OPERA and various types of molecular fragments calculated using ISIDA-Fragmentor26,27. Finally, we used a consensus of several individual models built with the help of random forest (RF)28 and eXtreme Gradient Boosting (XGBoost))29 machine-learning algorithms on a merged subset of ISIDA fragments and descriptors generated with the OPERA tool. Although the predictive accuracy in terms of RMSE is of the same order of magnitude as in Kuenemann et al.13 for absorption rates (Supplementary Table 2 and Supplementary Fig. 1), the applicability domain of our models is much larger, since the training set contained three times more compounds. It is worth noting that a QSPR model which did not allow one to achieve an excellent accuracy can still be useful for ranking the amines from the large compounds databases13,30. Therefore, we retrieved from the public database ZINC23 the tertiary amines which were not too large (Mw ≤ 250 gmol−1), not too lipophilic (−1 ≤ clogP ≤1), and readily available from suppliers. In total, more than 800 amines were screened virtually. Numerous amines outranking MDEA in terms of the predicted absorption rates (RQSPR) were identified (Fig. 3a). For example, various substituted piperidines were among the compounds with the largest RQSPR (Fig. 3a).

Fig. 3: Virtual screening of tertiary amines and experimental validation.
figure 3

a Absorption rates predicted by the QSPR model (RQSPR) vs predicted pKa values. b RQSPR vs experimentally measured absorption rate (r(CO2)). c r(CO2) vs predicted pKa. d Free energies of absorption predicted by QSPR model (ΔGQSPR) vs predicted pKa. e ΔGQSPR vs r(CO2). f ΔGQSPR vs RQSPR. Amines present in the initial dataset from Chowdhury et al.19,20 are shown in orange. Amines selected for MD simulations in the present work are shown in black. The industrially used reference compound (MDEA) is shown in green. Eight novel amines which were not present in the training set are shown in blue. The CAS numbers of the most perspective compounds are shown.

Experimental measurement of the CO2 absorption kinetics

An experimental setup was put in place to measure and compare the rate of CO2 absorption in aqueous tertiary amines. For each experiment, the same initial amount of CO2 was set in contact with the solvent and the evolution toward equilibrium of the partial pressure of CO2 in the gas phase was measured over time. The slope of the absorption curve at the time at which 50% of the CO2 was absorbed (with respect to the equilibrium value) was calculated (r(CO2)). It is a measure of the rate of CO2 absorption. Eighteen amines comprising 7 amines from the initial set of 24 amines from Chowdhury et al.20, 3 amines from the diverse dataset of 100 amines, and 8 novel amines that were never present in the training set were purchased and an assessment of their absorption rate was performed (Fig. 3b, c, e and Supplementary Tables 3 and 4). Both ΔGQSPR and absorption rates RQSPR were highly correlated with r(CO2) for eight novel amines (Spearman ρ 0.93) as well as the predicted pKa values. Five out of eight purchased amines absorbed CO2 faster than MDEA. Two amines: 1-methyl- and 1-ethyl-3-pyrrolidinol (EPOL) were especially effective. These compounds represent an interesting class of the tertiary amines, which to our knowledge have not been explored yet.

While tertiary amines like the standard MDEA are often used for high-pressure natural gas treatment, they are not suitable for low-pressure anthropogenic CO2 removal due to the low CO2 absorption rate. Activators such as piperazine can be added to enhance the CO2 absorption rate. The impact of piperazine is shown in Fig. 4 for two amines, namely MDEA and EPOL. The latter is a tertiary amine that has been selected for its fast CO2 absorption rate following the virtual screening. In the absence of piperazine EPOL absorbs CO2 much faster than MDEA. The addition of piperazine significantly enhances the CO2 absorption rates with EPOL + PZ showing the fastest absorption.

Fig. 4: Experimental kinetic measurements with piperazine.
figure 4

Experimentally measured CO2 absorption rate (r(CO2)) of standard MDEA and EPOL, a new amine suggested in this work, and their mixtures with piperazine (+PZ). Aqueous alkanolamine mixtures contain 13 mol% amine and water and mixtures with PZ contain 11 mol% amine, 2.5 mol% PZ and water. Standard deviations of the values are shown as error bars.

In conclusion, a methodology for computer-aided design of tertiary amines effectively absorbing CO2 was suggested in this paper. The methodology is based on the combination of state-of-the-art molecular dynamics simulations that generate a sufficiently large dataset that are used as an input for machine-learning modelling followed by large-scale virtual screening. In parallel, the approach is experimentally validated. It allowed the identification of amines that absorb CO2 faster than those currently used in the industry. Since the development of an optimal solvent is a multi-objective task, we believe that the proposed methodology can be provisionally repurposed to application for modeling of other industrially important properties of alkanolamine-based solvents.

Methods

Molecular simulations

The approach developed recently and described in Rozanska et al.19, was used to compute the rates of CO2 absorption in aqueous amine solvents (see Supplementary Methods), which relies primarily on the solvation properties of OH, CO2, and HCO3. In this model, the tertiary amine solely acts as a base.

$${R}_{{{{{{\rm{MD}}}}}}}=A\left(T\right)\times {{\exp }}\left(\frac{{-\triangle G}^{\ne }}{{RT}}\right)\times \left[{{{{{{\rm{CO}}}}}}}_{2}\right][{{{{{{\rm{OH}}}}}}}^{-}]$$
(1)

The rates are obtained from Eq. (1) where RMD is the absorption rate, [CO2] and [OH] are the concentrations of carbon dioxide molecules and hydroxyl anions, respectively, ΔG is the Gibbs free energy barrier of the reaction CO2 + OH to HCO3, RT is the macroscopic thermodynamic energy unit, where R is the universal gas constant and T the absolute temperature, and A(T) is a temperature-dependent pre-exponential factor. In Eq. (1), ΔG is obtained from a Polanyi–Evans relation with as input the energy differences of solvation of OH + CO2 (reactants) and HCO3 (product) computed in the 124 aqueous amine solvents. The concentrations [CO2][OH] are obtained numerically solving pH equations, and A(T) is fitted using the experimental rates of the reaction CO2 + OH in ten aqueous amine solvents. The Polanyi–Evans relation between ΔG and energy differences of solvation, ΔG, of OH + CO2 and HCO3 is given by Eq. (2).

$$\Delta {G}^{{{\ddagger}} }=a\Delta G(T)+b$$
(2)

where a and b are fitted to reproduce the experimental rates in pure water and ten aqueous amine solvents and ΔG(T) is the energy difference of solvation of OH + CO2 and HCO3 obtained from molecular dynamics simulations. Additional details and the values for A(T), a, and b can be found in Rozanska et al.19.

For the calculation of the regeneration energy, the following three reactions are considered:

$${{{{{{\rm{OH}}}}}}}^{-}+{{{{{{\rm{CO}}}}}}}_{2}={{{{{{{\rm{HCO}}}}}}}_{3}}^{-}$$
(3)
$${{{{{\rm{Amine}}}}}}+{{{{{{\rm{H}}}}}}}_{2}{{{{{\rm{O}}}}}}+{{{{{{\rm{CO}}}}}}}_{2}={{{{{\rm{ammonium}}}}}}+{{{{{{{\rm{HCO}}}}}}}_{3}}^{-}$$
(4)
$${{{{{\rm{Amine}}}}}}+{{{{{{\rm{H}}}}}}}_{2}{{{{{\rm{O}}}}}}={{{{{\rm{ammonium}}}}}}+{{{{{{\rm{OH}}}}}}}^{-}$$
(5)

The free energy of absorption is ΔG4 (=ΔGMD in Fig. 2) = ΔG3 + ΔG5 with ΔG3 calculated from the molecular simulations (ΔG(T) in Eq. (2)) in every aqueous amine and ΔG5 calculated from the amine pKa.

Quantitative structure–property relationship modeling

All compound structures were standardized using RDKit31 nodes in KNIME32. The standardization procedure included aromatization, stereochemistry depletion, removal of salts/solvents, neutralization, removal of explicit hydrogens. Standardized structures for 124 amines are given in Supplementary Table 1 and at https://github.com/AxelRolov/CO2_chemical_solvents.

In all, 193 different ISIDA fragment descriptors were generated using the Fragmentor17 software26,27. These fragments represent either sequences (the shortest topological paths with an explicit presentation of all atoms and bonds), atom pairs, or triplets (all the possible combinations of three atoms in a graph with the topological distance between each pair indicated).

Various physicochemical properties (pKa, logP, melting and boiling points, vapor pressure, water solubility, etc.) and several substructural fragments counts (ring count, heavy atom count, etc.) used as descriptors, were calculated using OPERA v.2.625.

All descriptors used in this work are available at https://github.com/AxelRolov/CO2_chemical_solvents.

Prior to the application of machine-learning algorithms RMD and ΔGMD values were transformed to a logarithmic scale, i.e., the negative value of decimal logarithm was taken (−log10RMD, −log10(−ΔGMD)).

Random forest (RF): RF algorithm28 implemented in sci-kit learn library (v. 0.22.1)33, was used. The following hyperparameters were optimized (grid search): number of trees (100, 300, 1000), number of features (all features, one-third of all features, log2 of the number of features), the maximum depth of the tree (10, 30, full tree), bootstrapping (with and without the usage of bootstrap samples for building the tree).

XGBoost (XGB): XGBoost algorithm29 as implemented in XGBoost python module (v.1.2.0; https://xgboost.readthedocs.io/en/latest/python/python_intro.html) was used. The following hyperparameters were tuned during optimization (grid search): number of trees (50, 100, 300, 500), number of features (all features, 70% of all features), number of samples (all samples, 70% of all samples), the maximum depth of the tree (5, 20, full tree), learning rate (0.3, 0.1, 0.5, 0.05). All other parameters were left as default.

Support vector regression (SVR): SVR algorithm34 implemented in sci-kit learn library (v. 0.22.1), was used. The descriptors were scaled to the [0,1] range before applying the algorithm. The following hyperparameters were tuned during optimization (grid search): kernel (linear, rbf, poly, sigmoid), kernel coefficient (1, 0.1, 0.01, 0.001, 0.0001), regularization parameter (0.1, 1, 10, 100, 1000).

The modeling workflow was implemented using the sci-kit learn library (v. 0.22.1) in Python 3.7 scripting language (Supplementary Fig. 2). Identical modeling workflows were used for modeling absorption rates (RMD) and energies of absorption (ΔGMD). The values were expressed as negative logarithms of base 10. At the first stage of the modeling, a machine-learning algorithm: RF, SVR, and XGB were tested in fivefold cross-validation, which was repeated five times. For each descriptor set, the model’s measures of performance were calculated and several models with a coefficient of determination Q2CV ≥ 0.6 for (RMD) and Q2CV ≥ 0.7 for (ΔGMD) were selected for consensus modeling. Consensus models were built for each descriptor type separately. In order to assess a propensity to predict data never seen during the training of the model, a nested cross-validation procedure35 has been implemented. Here the method hyperparameters were found by optimizing the model performance in the fivefold cross-validation inner loop, while prediction was made for the test set from the outer loop, which represent a fold of the outer fivefold cross-validation cycle. To avoid a bias with the compounds numbering in the parent set, this procedure was repeated five times after reshuffling of the compounds. In such a way, the overall performance of the model (Q2NCV, RMSENCV, MAENCV) were estimated as an average of related statistical parameters obtained for each (out of 5) individual cross-validation loop.

Equations (68) were used to calculate the measures of the model’s performance in cross-validation:

$${Q}_{{{{{{\rm{CV}}}}}}}^{2}=\frac{{\sum }_{j=1}^{5}(1-\frac{{\sum }_{i=1}^{n}{({y}_{i,{\exp }}-{y}_{i,{pred}})}^{2}}{{\sum }_{i=1}^{n}{({y}_{i,{\exp }}-\bar{{{{{{\rm{y}}}}}}})}^{2}})}{5}$$
(6)
$${{RMSE}}_{{{{{{\rm{CV}}}}}}}=\frac{{\sum }_{j=1}^{5}\sqrt{{\sum }_{i=1}^{n}\frac{{({y}_{i,{\exp }}-{y}_{i,{pred}})}^{2}}{n}}}{5}$$
(7)
$${{MAE}}_{{{{{{\rm{CV}}}}}}}=\frac{{\sum }_{j=1}^{5}{\sum }_{i=1}^{n}\frac{\left|{y}_{i,{\exp }}-{y}_{i,{pred}}\right|}{n}}{5}$$
(8)

Above, n is the number of compounds in the learning set, yi,exp, yi,pred experimental and values predicted in fivefold cross-validation for compound i from the learning set, j is the index of the repetition of the tenfold cross-validation procedure.

Each of the selected models was then associated with an Applicability Domain (AD), defined as a boundary box. The pool of selected models extracted from the given dataset can now be used as a consensus predictor, returning for each input solvent candidate a mean value of solubility estimates and its standard deviation, taken over the predictions returned by each model in the pool or, alternatively, over the predictions returned by only those models having the candidate within their AD.

Outlying data points were defined as the data points, for which absolute errors (|χexp−χpred | ) from cross-validation were larger than 2×RMSECV threshold.

The absence of chance correlation was checked through the Y-randomization procedure. A Y-randomization test was performed in the following way: −log10χ values (y values) were shuffled, models were built using shuffled values and the values from the corresponding cross-validation test set were calculated. This procedure was repeated 100 times for each fold and the maximum values of the out-of-bag coefficient of determination were reported.

A library for virtual screening was performed in the following way. At first, all compounds from ZINC database with molecular weight no larger than 250 g/mol and calculated logP in the range of (−1,1) were retrieved. Structures were standardized and then filtered. All compounds which did not contain tertiary amines, compounds, containing double bonds, aromatic rings, primary or secondary amine groups, ketones and sulfur-containing compounds except for thiols and thioethers were removed. Structures of screened compounds as well as predicted values are available at https://github.com/AxelRolov/CO2_chemical_solvents.

Experimental measurement of CO2 absorption rates

To measure the kinetics of absorption and desorption of acid gases in aqueous amine solutions, a thermoregulated constant interfacial area Lewis-type reactor cell was used36. The reactor (Supplementary Figs. 36) is equipped with an internal stirring system (magnetic stirrer) with the external motor. The operator needs to take care to select the speed of stirring without disturbing the interface (interface must be flat). Temperature is given by two platinum probes located at the upper and lower flanges (with the possibility to determine the gradient of temperature). The cell is immersed in a liquid bath. An electric resistor is introduced into the upper flange to control the gradient of temperature and avoid condensation of water and amine. Two capillary samplers are adapted to sample the vapor phase. The capillary samplers (ROLSI®, Armines’ patent) are capable of withdrawing and sending micro samples to a gas chromatograph without perturbing the equilibrium conditions over numerous samplings, thus leading to repeatable and reliable results. Analytical work was carried out using a gas chromatograph (PERICHROM model PR2100, France) equipped with a thermal conductivity detector (TCD) connected to a data software system. Helium is used as the carrier gas in this experiment. The model of the GC column is Porapak R (Porapak R 80/100 mesh, 1 m × 2 mm ID Silcosteel). Each ROLSI® sampler is connected to a TCD. A tube allows either to evacuate or to introduce CO2 from or into the cell. The kinetics of gas absorption are determined by recording the pressure drop through a calibrated pressure transducer. A computer equipped with data acquisition system records the pressure as a function of time.

The experimental procedure is the following:

The desired amount of solvent is introduced into the cell. The density obtained using a low-pressure vibrating tube densitometer (Anton Paar DSA 5000) is used to determine the exact mole number of solvent.

At least 5 bar of methane is added. We add methane because with this configuration, it is not possible to sample at pressures lower than GC carrier gas pressure.

CO2 is added from the thermal press. We record pressure and temperature before and after the loading (see Supplementary Fig. 7 as an example). It permits to calculate very accurately the mole number of CO2 introduced and so, we can estimate very accurately the loadings of CO2.

The experimental method36 is similar to the one used to calculate the solubility of CO2 in alkanolamine amine solution at equilibrium. The method considered is based on the “static-synthetic method”. More details concerning the method are presented in the Supplementary Methods.

During the absorption of the CO2, we take samples to follow the evolution of the vapor composition (and so CO2 partial pressure) as a function of time. When the equilibrium is reached (constant pressure and constant temperature), the vapor phase composition is determined.

We have used the GERG 2008 Equation of state37 implemented in REFPROP 10.038 to estimate the densities of the vapor phase which is a mixture of CO2 and CH4.

The calculation of the acid gas solubility in the solvent is based on mass balance.

The volume of the liquid phase is obtained by considering the mole number of solvent introduced and its density at the temperature of measurement.

$${V}^{{{{{{\rm{L}}}}}}}=\frac{{n}_{{solvent}}}{\rho \left({T}_{{{{{{\rm{cell}}}}}}}\right)}$$
(9)

Consequently, the volume of the vapor phase is calculated by difference between the total volume and the volume of the liquid phase.

$${V}^{{{{{{\rm{V}}}}}}}={V}^{{{{{{\rm{T}}}}}}}-{V}^{{{{{{\rm{L}}}}}}}$$
(10)

If the introduction of the solute doesn’t modify the level of the liquid interface in the equilibrium cell, we can consider Eq. (11).

$${V}^{L}=\pi {r}_{{{{{{\rm{cell}}}}}}}^{2}{h}_{{{{{{\rm{liq}}}}}}}$$
(11)

Where rcell is the radius of the equilibrium cell, hliq the level of the vapor liquid interface.

The mole number of solute in the vapor phase is calculated by considering the density of the gas at the temperature and pressure of solute (\({P}_{{{{{{\rm{Solute}}}}}}}={P}_{{{{{{\rm{cell}}}}}}}-{P}_{{{{{{\rm{solvent}}}}}}}^{{{{{{\rm{sat}}}}}}}\)). REFPROP v10.0 is used to calculate this density \({\rho }^{{{{{{\rm{V}}}}}}}\left({T}_{{{{{{\rm{cell}}}}}}},{P}_{{{{{{\rm{solute}}}}}}}\right)\). In the case of a mixture, the global composition needs to be considered \({\rho }^{{{{{{\rm{V}}}}}}}\left({T}_{{{{{{\rm{cell}}}}}}},{P}_{{{{{{\rm{solute}}}}}}},y\right)\).

The volume of the vapor phase is used to calculate the mole number of solute in the vapor phase (Eq. (12)).

$${n}^{{{{{{\rm{V}}}}}}}={V}^{{{{{{\rm{V}}}}}}}{\rho }^{{{{{{\rm{V}}}}}}}\left({T}_{{{{{{\rm{cell}}}}}}},{P}_{{{{{{\rm{solute}}}}}}}\right)$$
(12)

In the case of a mixture, the same equation is used to calculate the total mole number of solute in the vapor phase.

So, the mole number of solute in the liquid phase is determined by considering Eq. (13).

$${n}^{{{{{{\rm{L}}}}}}}={n}^{{{{{{\rm{T}}}}}}}-{n}^{{{{{{\rm{V}}}}}}}$$
(13)

In the case of the mixture, the mole number of each species is calculated by considering the global composition of the mixture (z) and the composition of the vapor phase (y), Eq. (14).

$${n}_{i}^{{{{{{\rm{L}}}}}}}={z}_{i}{n}^{{{{{{\rm{T}}}}}}}-{y}_{i}{n}^{{{{{{\rm{V}}}}}}}$$
(14)

The solubility is determined with Eq. (15).

$${x}_{i}=\frac{{n}_{i}}{\sum {n}_{j}}$$
(15)