Abstract
Carbon capture and storage technologies are projected to increasingly contribute to cleaner energy transitions by significantly reducing CO2 emissions from fossil fuel-driven power and industrial plants. The industry standard technology for CO2 capture is chemical absorption with aqueous alkanolamines, which are often mixed with an activator, piperazine, to increase the overall CO2 absorption rate. Inefficiency of the process due to the parasitic energy required for thermal regeneration of the solvent drives the search for new tertiary amines with better kinetics. Improving the efficiency of experimental screening using computational tools is challenging due to the complex nature of chemical absorption. We have developed a novel computational approach that combines kinetic experiments, molecular simulations and machine learning for the in silico screening of hundreds of prospective candidates, and identified a class of tertiary amines that absorbs CO2 faster than a typical commercial solvent when mixed with piperazine, which was confirmed experimentally.
Introduction
Numerous technologies exist for capturing CO2 including chemical absorption, cryogenic separation, removal with membranes, and adsorption with zeolites or metal–organic frameworks1,2,3,4,5,6. The cyclic chemical absorption and regeneration process based on common primary and secondary amines such as monoethanolamine (MEA) and diethanolamine (DEA) is the most mature in industrial applications3,5. Unhindered primary and secondary amines react rapidly with CO2 to form very stable carbamates. The amount of energy required for the regeneration of these solvents is large. Carbon capture applied to a coal-fired power plant may reduce the net output of the plant by 30%6. With sterically hindered amines or tertiary amines like the standard methyldiethanolamine (MDEA), CO2 is captured as bicarbonate, which has a much smaller heat of reaction than carbamate formation, resulting in regeneration energy savings7. Moreover, their CO2 absorption capacity is much higher. Tertiary amines are therefore increasingly used in the high-pressure natural gas treatment industry to remove acid gases like CO2. However, in general, the rate of direct bicarbonate formation is much lower than that of carbamate formation, resulting in much slower CO2 absorption rates with tertiary amines and thus in unacceptably large equipment for low-pressure anthropogenic (flue gas) CO2 capture applications5,7. To tackle this problem, several approaches have been suggested. Several studies reported that the use of a catalyst speeds up the absorption of CO2 and/or lowers the energetic cost of solvent regeneration8. Another option, which is currently followed by the industry, consists of adding an activator, piperazine, which significantly boosts the overall CO2 absorption rate without increasing the regeneration energy too much9.
A more straightforward strategy would be the identification of new tertiary amines with much higher absorption rates than standard MDEA, to which piperazine can eventually be added. Since experimental measurement of CO2 absorption kinetics is a time- and labor-intensive process, the rational design of tertiary amines that can rapidly absorb CO2 requires a quantitative model that allows only the best candidates to be selected for experimental measurements.
Concerning alternative processes based on adsorption in porous solids (still under development), a lower theoretical energy consumption is expected due to the weaker physical adsorption. Molecular simulations and machine learning have already been extensively used to perform virtual screening of hundreds of thousands of structures to identify potentially better materials for CO2 adsorption10,11. Until now it was not possible to apply a similar methodology for amines, because of the difficulty related to the computation of chemical reactions. Amines were rationally designed based on physical and thermodynamic properties and the CO2 absorption rates were measured experimentally for only the most promising candidates7,12. Previously, machine-learning algorithms were tentatively applied for modeling quantitative structure–property relationship (QSPR) of alkanolamines’ CO2 absorption-related properties13,14,15,16,17,18. However, the availability of only a very small amount of data points limited the applicability domain of the models. Hence, to address this issue, we developed and applied a methodology for the identification of tertiary amines effectively absorbing CO2 based on the combination of molecular simulations19 and machine learning. In parallel, an experimental setup for the measurement of CO2 absorption rates has been specifically designed and put in place to validate the approach.
Results and discussion
Design of the methodology for CO2 absorbents screening
The workflow of the methodology is presented in Fig. 1. Chowdhury et al.20 published a consistent experimental dataset of the absorption rates of CO2 for 24 aqueous tertiary amines (313 K, 30 wt% amine). In the absence of a clear relationship between the structure or the chemical properties (e.g., the basicity) of the amines and the CO2 absorption rates, we developed a molecular dynamics (MD) based model that can accurately predict those experimental CO2 absorption rates19. It was found that, while the basicity of the amine (quantified by the pKa) is important, the key to the precision of molecular simulations is the inclusion of subtle but important solvation effects in the calculation of the activation Gibbs free energy of the reaction with an accuracy better than 1 kJ mol−1. One of the important features of the MD model19 is its robustness to reasonable changes in the concentration of amine and in temperature, which allows it to be applied to a rather wide range of experimental setups. Hence, the model was applied to predict the rates at 13 mol% of amines and at 323 K, because these conditions are more representative of industrial absorption5.
Being much less resource- and cost-demanding, molecular simulations can thus be used instead of the experiments to get enough data for building a reliable QSPR model with a wide applicability domain.
Molecular simulations of CO2 absorption process
A dataset containing 100 structurally diverse tertiary amines was composed based on the in-house TotalEnergies dataset of amines with known experimental properties, complemented with tertiary amines extracted from literature and public databases (PubChem21,22, ZINC23,24). The selected compounds comprise diverse chemotypes, including linear and cyclic amines, diamines, and amines containing thiol and thioether groups. Molecular simulations (see “Methods”) were performed for the initial set of 24 amines and for the selected set of 100 amines at 323 K and using a 13 mol% concentration of amine. From the MD simulations, absorption rates (RMD) and free energies of absorption (ΔGMD) were obtained. Notably, the RMD values calculated at 313 and 323 K are highly correlated (Fig. 2a, Spearman rank correlation coefficient (ρ) 0.99).
As shown in Fig. 2b, the most rapidly absorbing compound according to the MD calculations and the data from Chowdhury et al.20 was 3-(Diethylamino)-1,2-propanediol (DEA-1,2-PD). However, most of the other compounds with the largest predicted rates of absorption (RMD) contained either piperidine or pyrrolidine cycles. This is in line with the data from Chowdhury et al.20, who showed that 3-piperidino-1,2-propanediol (3PP-1,2-PD) and 1-methyl-2-piperidineethanol were significantly faster than the industrially used methyldiethanolamine (MDEA). Figure 2c illustrates that the computed CO2 absorption Gibbs free energies ΔGMD are almost perfectly correlated with the CO2 absorption rates, RMD (Spearman ρ −0.98): the slower the CO2 absorption, the higher the absorption Gibbs free energy. The correlation is not linear, and the decrease of ΔGMD slows down significantly at higher CO2 absorption rates.
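The Spearman coefficient used above compares ranks rather than raw values, which is why a monotonic but non-linear relationship such as the one between ΔGMD and RMD can still give a coefficient close to −1. A minimal sketch in plain Python illustrates this; the numerical values are hypothetical and are not taken from the paper's dataset.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: the free energy decreases with the rate, but the
# decrease flattens out at high rates (monotonic, not linear).
rates = [1.0, 2.0, 4.0, 8.0, 16.0]
delta_g = [-5.0, -9.0, -12.0, -13.5, -14.0]

rho = spearman_rho(rates, delta_g)
r_linear = pearson(rates, delta_g)
```

Here `rho` reflects the perfect monotonic ordering, whereas `r_linear` is noticeably weaker because the relationship is curved.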
Virtual screening of tertiary amines and experimental validation
Machine-learning algorithms were applied to establish quantitative structure–property relationships and screen a set of tertiary amines from a public dataset. The values of pKa predicted by the OPERA model25 can be used as a rather good predictor for ΔGMD. Indeed, fitting a linear regression with the pKa values as the only predictor leads to a reasonable predictive performance in cross-validation (Supplementary Table 2). For modeling both end-points (ΔGMD and RMD), we implemented a machine-learning workflow combining several machine-learning algorithms and various descriptors of molecular structures. Thus, predicted pKa values were complemented with other descriptor types: physicochemical descriptors from OPERA and various types of molecular fragments calculated using ISIDA-Fragmentor26,27. Finally, we used a consensus of several individual models built with the help of random forest (RF)28 and eXtreme Gradient Boosting (XGBoost)29 machine-learning algorithms on a merged subset of ISIDA fragments and descriptors generated with the OPERA tool. Although the predictive accuracy in terms of RMSE is of the same order of magnitude as in Kuenemann et al.13 for absorption rates (Supplementary Table 2 and Supplementary Fig. 1), the applicability domain of our models is much larger, since the training set contained three times more compounds. It is worth noting that a QSPR model that does not achieve excellent accuracy can still be useful for ranking the amines from large compound databases13,30. Therefore, we retrieved from the public database ZINC23 the tertiary amines which were not too large (Mw ≤ 250 g mol−1), not too lipophilic (−1 ≤ clogP ≤ 1), and readily available from suppliers. In total, more than 800 amines were screened virtually. Numerous amines outranking MDEA in terms of the predicted absorption rates (RQSPR) were identified (Fig. 3a). For example, various substituted piperidines were among the compounds with the largest RQSPR (Fig. 3a).
Experimental measurement of the CO2 absorption kinetics
An experimental setup was put in place to measure and compare the rate of CO2 absorption in aqueous tertiary amines. For each experiment, the same initial amount of CO2 was brought into contact with the solvent and the evolution of the partial pressure of CO2 in the gas phase toward equilibrium was measured over time. The slope of the absorption curve at the time at which 50% of the CO2 was absorbed (with respect to the equilibrium value), r(CO2), was calculated and used as a measure of the rate of CO2 absorption. Eighteen amines were purchased and their absorption rates were assessed: 7 amines from the initial set of 24 amines from Chowdhury et al.20, 3 amines from the diverse dataset of 100 amines, and 8 novel amines that were never present in the training set (Fig. 3b, c, e and Supplementary Tables 3 and 4). Both ΔGQSPR and the absorption rates RQSPR were highly correlated with r(CO2) for the eight novel amines (Spearman ρ 0.93), as were the predicted pKa values. Five of the eight novel amines absorbed CO2 faster than MDEA. Two amines, 1-methyl-3-pyrrolidinol and 1-ethyl-3-pyrrolidinol (EPOL), were especially effective. These compounds represent an interesting class of tertiary amines which, to our knowledge, has not been explored yet.
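As an illustration of how a quantity like r(CO2) could be extracted from a recorded pressure curve, the following sketch takes a finite-difference slope over the sampling interval in which half of the total pressure drop has occurred. The function name, the synthetic exponential decay, and the assumption that the last recorded point is at equilibrium are ours and do not describe the authors' data processing.

```python
import math


def slope_at_half_absorption(times, pressures):
    """Slope dP/dt where half of the total CO2 pressure drop has occurred.

    Assumes a monotonically decreasing curve whose last point is at
    equilibrium; both assumptions are made for illustration only.
    """
    p_start, p_eq = pressures[0], pressures[-1]
    p_half = p_eq + 0.5 * (p_start - p_eq)
    for i in range(1, len(times)):
        if pressures[i] <= p_half:
            # Finite difference over the interval that brackets p_half.
            return (pressures[i] - pressures[i - 1]) / (times[i] - times[i - 1])
    raise ValueError("curve never reaches half of the total pressure drop")


# Synthetic curve (arbitrary units): exponential approach to equilibrium.
k, p_initial, p_final = 0.05, 1.0, 0.2
times = [0.5 * i for i in range(601)]
pressures = [p_final + (p_initial - p_final) * math.exp(-k * t) for t in times]

r_co2 = slope_at_half_absorption(times, pressures)
```

For this synthetic curve the analytical slope at the half-way point is −k (p_initial − p_final) / 2, so `r_co2` should be close to −0.02; a faster-absorbing solvent would give a more negative slope.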
While tertiary amines like the standard MDEA are often used for high-pressure natural gas treatment, they are not suitable for low-pressure anthropogenic CO2 removal due to the low CO2 absorption rate. Activators such as piperazine can be added to enhance the CO2 absorption rate. The impact of piperazine is shown in Fig. 4 for two amines, namely MDEA and EPOL. The latter is a tertiary amine that has been selected for its fast CO2 absorption rate following the virtual screening. In the absence of piperazine EPOL absorbs CO2 much faster than MDEA. The addition of piperazine significantly enhances the CO2 absorption rates with EPOL + PZ showing the fastest absorption.
In conclusion, a methodology for the computer-aided design of tertiary amines that effectively absorb CO2 was proposed in this paper. The methodology is based on state-of-the-art molecular dynamics simulations that generate a sufficiently large dataset, which is used as input for machine-learning modeling followed by large-scale virtual screening. In parallel, the approach was validated experimentally. It allowed the identification of amines that absorb CO2 faster than those currently used in the industry. Since the development of an optimal solvent is a multi-objective task, we believe that the proposed methodology can be repurposed for modeling other industrially important properties of alkanolamine-based solvents.
Methods
Molecular simulations
The approach developed recently and described in Rozanska et al.19 was used to compute the rates of CO2 absorption in aqueous amine solvents (see Supplementary Methods). It relies primarily on the solvation properties of OH–, CO2, and HCO3−. In this model, the tertiary amine acts solely as a base.
The rates are obtained from Eq. (1), where RMD is the absorption rate, [CO2] and [OH–] are the concentrations of carbon dioxide molecules and hydroxyl anions, respectively, ΔG⧧ is the Gibbs free energy barrier of the reaction CO2 + OH– to HCO3–, RT is the macroscopic thermodynamic energy unit, where R is the universal gas constant and T the absolute temperature, and A(T) is a temperature-dependent pre-exponential factor. In Eq. (1), ΔG⧧ is obtained from a Polanyi–Evans relation that takes as input the energy differences of solvation of OH– + CO2 (reactants) and HCO3– (product) computed in the 124 aqueous amine solvents. The concentrations [CO2] and [OH−] are obtained by numerically solving pH equations, and A(T) is fitted using the experimental rates of the reaction CO2 + OH− in ten aqueous amine solvents. The Polanyi–Evans relation between ΔG⧧ and the energy difference of solvation, ΔG, of OH– + CO2 and HCO3– is given by Eq. (2).
where a and b are fitted to reproduce the experimental rates in pure water and ten aqueous amine solvents and ΔG(T) is the energy difference of solvation of OH– + CO2 and HCO3– obtained from molecular dynamics simulations. Additional details and the values for A(T), a, and b can be found in Rozanska et al.19.
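The display forms of Eqs. (1) and (2) are not reproduced in this version of the text. A form consistent with the symbol definitions given above, assuming a standard Arrhenius-type rate law and a linear Polanyi–Evans relation, would read as follows; this is a reconstruction under those assumptions, and the authoritative expressions are those in Rozanska et al.19.

```latex
\begin{align}
R_{\mathrm{MD}} &= A(T)\,[\mathrm{CO_2}]\,[\mathrm{OH^-}]\,
  \exp\!\left(-\frac{\Delta G^{\ddagger}}{RT}\right) \tag{1}\\
\Delta G^{\ddagger} &= a\,\Delta G(T) + b \tag{2}
\end{align}
```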
For the calculation of the regeneration energy, the following three reactions are considered:
The free energy of absorption is ΔG4 (=ΔGMD in Fig. 2) = ΔG3 + ΔG5 with ΔG3 calculated from the molecular simulations (ΔG(T) in Eq. (2)) in every aqueous amine and ΔG5 calculated from the amine pKa.
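The three reactions are not reproduced in this version of the text. Given that ΔG3 corresponds to the simulated reaction of CO2 with OH–, that ΔG5 is derived from the amine pKa, and that ΔG4 = ΔG3 + ΔG5, a set of reactions consistent with these statements would be the following (R3N denotes a generic tertiary amine; the assignment of the numbers to the reactions is inferred from the text, not confirmed by it):

```latex
\begin{align}
\mathrm{CO_2 + OH^-} &\rightarrow \mathrm{HCO_3^-}
  & \Delta G_3 \tag{3}\\
\mathrm{CO_2 + R_3N + H_2O} &\rightarrow \mathrm{R_3NH^+ + HCO_3^-}
  & \Delta G_4 \tag{4}\\
\mathrm{R_3N + H_2O} &\rightarrow \mathrm{R_3NH^+ + OH^-}
  & \Delta G_5 \tag{5}
\end{align}
```

Adding reactions (3) and (5) cancels OH– and gives reaction (4), which is why the free energies add in the same way.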
Quantitative structure–property relationship modeling
All compound structures were standardized using RDKit31 nodes in KNIME32. The standardization procedure included aromatization, stereochemistry depletion, removal of salts/solvents, neutralization, removal of explicit hydrogens. Standardized structures for 124 amines are given in Supplementary Table 1 and at https://github.com/AxelRolov/CO2_chemical_solvents.
In all, 193 different ISIDA fragment descriptors were generated using the Fragmentor17 software26,27. These fragments represent either sequences (the shortest topological paths with an explicit presentation of all atoms and bonds), atom pairs, or triplets (all the possible combinations of three atoms in a graph with the topological distance between each pair indicated).
Various physicochemical properties (pKa, logP, melting and boiling points, vapor pressure, water solubility, etc.) and several substructural fragments counts (ring count, heavy atom count, etc.) used as descriptors, were calculated using OPERA v.2.625.
All descriptors used in this work are available at https://github.com/AxelRolov/CO2_chemical_solvents.
Prior to the application of machine-learning algorithms RMD and ΔGMD values were transformed to a logarithmic scale, i.e., the negative value of decimal logarithm was taken (−log10RMD, −log10(−ΔGMD)).
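A minimal sketch of this transformation is given below. It assumes that the rates are positive and the absorption free energies negative, as the expression −log10(−ΔGMD) implies; the function name and the example values are hypothetical.

```python
import math


def to_model_scale(rate, delta_g):
    """Return (-log10(rate), -log10(-delta_g)) for one amine.

    Assumes rate > 0 and delta_g < 0 so that both logarithms are defined.
    """
    if rate <= 0 or delta_g >= 0:
        raise ValueError("expected a positive rate and a negative free energy")
    return -math.log10(rate), -math.log10(-delta_g)


# Hypothetical values in arbitrary units.
y_rate, y_dg = to_model_scale(1e-3, -10.0)
```

With this convention a larger transformed rate value corresponds to slower absorption, which is worth keeping in mind when ranking screened compounds.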
Random forest (RF): the RF algorithm28 implemented in the scikit-learn library (v. 0.22.1)33 was used. The following hyperparameters were optimized (grid search): number of trees (100, 300, 1000), number of features (all features, one-third of all features, log2 of the number of features), the maximum depth of the tree (10, 30, full tree), bootstrapping (with and without the usage of bootstrap samples for building the tree).
XGBoost (XGB): XGBoost algorithm29 as implemented in XGBoost python module (v.1.2.0; https://xgboost.readthedocs.io/en/latest/python/python_intro.html) was used. The following hyperparameters were tuned during optimization (grid search): number of trees (50, 100, 300, 500), number of features (all features, 70% of all features), number of samples (all samples, 70% of all samples), the maximum depth of the tree (5, 20, full tree), learning rate (0.3, 0.1, 0.5, 0.05). All other parameters were left as default.
Support vector regression (SVR): the SVR algorithm34 implemented in the scikit-learn library (v. 0.22.1) was used. The descriptors were scaled to the [0,1] range before applying the algorithm. The following hyperparameters were tuned during optimization (grid search): kernel (linear, rbf, poly, sigmoid), kernel coefficient (1, 0.1, 0.01, 0.001, 0.0001), regularization parameter (0.1, 1, 10, 100, 1000).
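The three grids listed above can be written down as plain dictionaries to see how many hyperparameter combinations each search visits. The keys below are our guess at the corresponding scikit-learn and XGBoost parameter names (for instance `colsample_bytree` for the fraction of features and `gamma` for the kernel coefficient); the text does not state which parameters were used, and `None` is only a placeholder for "all features" or "full tree".

```python
from itertools import product

# Assumed mapping of the described settings to parameter names.
rf_grid = {
    "n_estimators": [100, 300, 1000],
    "max_features": [None, 1 / 3, "log2"],  # None stands for all features
    "max_depth": [10, 30, None],            # None stands for a full tree
    "bootstrap": [True, False],
}
xgb_grid = {
    "n_estimators": [50, 100, 300, 500],
    "colsample_bytree": [1.0, 0.7],
    "subsample": [1.0, 0.7],
    "max_depth": [5, 20, None],             # None is a placeholder for "full tree"
    "learning_rate": [0.3, 0.1, 0.5, 0.05],
}
svr_grid = {
    "kernel": ["linear", "rbf", "poly", "sigmoid"],
    "gamma": [1, 0.1, 0.01, 0.001, 0.0001],
    "C": [0.1, 1, 10, 100, 1000],
}


def expand(grid):
    """Enumerate every combination of a grid as a list of dictionaries."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


grid_sizes = {
    name: len(expand(grid))
    for name, grid in (("RF", rf_grid), ("XGB", xgb_grid), ("SVR", svr_grid))
}
```

Under this reading the searches cover 54 (RF), 192 (XGB) and 100 (SVR) combinations per descriptor set, each of which is evaluated in cross-validation.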
The modeling workflow was implemented using the scikit-learn library (v. 0.22.1) in Python 3.7 (Supplementary Fig. 2). Identical modeling workflows were used for modeling absorption rates (RMD) and energies of absorption (ΔGMD). The values were expressed as negative logarithms of base 10. At the first stage of the modeling, three machine-learning algorithms (RF, SVR, and XGB) were tested in fivefold cross-validation, which was repeated five times. For each descriptor set, the model’s measures of performance were calculated and several models with a coefficient of determination Q2CV ≥ 0.6 for RMD and Q2CV ≥ 0.7 for ΔGMD were selected for consensus modeling. Consensus models were built for each descriptor type separately. In order to assess the ability of the models to predict data never seen during training, a nested cross-validation procedure35 was implemented. Here the method hyperparameters were found by optimizing the model performance in the fivefold cross-validation inner loop, while predictions were made for the test set from the outer loop, which represents one fold of the outer fivefold cross-validation cycle. To avoid a bias due to the ordering of the compounds in the parent set, this procedure was repeated five times after reshuffling the compounds. In this way, the overall performance of the model (Q2NCV, RMSENCV, MAENCV) was estimated as an average of the corresponding statistical parameters obtained for each of the five individual cross-validation loops.
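The nested cross-validation scheme described above can be sketched in plain Python. To keep the example self-contained, a toy one-dimensional k-nearest-neighbour regressor stands in for RF, SVR and XGB, and the data are synthetic; the function names, the grid of k values and the data are assumptions made for illustration and are not the authors' implementation.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, k):
    """Toy 1-D k-nearest-neighbour regressor (stand-in for RF/SVR/XGB)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_rmse(xs, ys, subset, folds, k):
    """Inner-loop RMSE; folds hold positions within the list `subset`."""
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [subset[p] for p in range(len(subset)) if p not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for p in fold:
            i = subset[p]
            sq_err += (ys[i] - knn_predict(tx, ty, xs[i], k)) ** 2
    return (sq_err / len(subset)) ** 0.5


def nested_cv(xs, ys, grid=(1, 3, 5), n_folds=5, seed=0):
    """One round of nested CV; returns (Q2, RMSE) on outer-loop predictions."""
    rng = random.Random(seed)
    n = len(xs)
    preds = [0.0] * n
    for outer_fold in k_fold_indices(n, n_folds, rng):
        held_out = set(outer_fold)
        train = [i for i in range(n) if i not in held_out]
        # Inner loop: the hyperparameter is chosen on training data only.
        inner_folds = k_fold_indices(len(train), n_folds, rng)
        best_k = min(grid, key=lambda k: cv_rmse(xs, ys, train, inner_folds, k))
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for i in outer_fold:
            preds[i] = knn_predict(tx, ty, xs[i], best_k)
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot, (ss_res / n) ** 0.5


# Synthetic data; five repetitions with reshuffling, then averaging.
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]
q2_runs = [nested_cv(xs, ys, seed=s)[0] for s in range(5)]
q2_ncv = sum(q2_runs) / len(q2_runs)
```

The key point of the design is that the outer test fold never influences the choice of the hyperparameter, so `q2_ncv` estimates performance on compounds the tuned model has not seen.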
Equations (6–8) were used to calculate the measures of the model’s performance in cross-validation:
Above, n is the number of compounds in the learning set, yi,exp and yi,pred are the experimental value and the value predicted in fivefold cross-validation for compound i from the learning set, and j is the index of the repetition of the fivefold cross-validation procedure.
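Equations (6–8) are not reproduced in this version of the text. The standard definitions of the three measures named in the preceding section, written with the symbols defined above and evaluated for one repetition j of the cross-validation, are given below. The order of the equations and the way the repetitions are combined (here assumed to be a simple average over j) are assumptions.

```latex
\begin{align}
Q^2_{\mathrm{CV}} &= 1 - \frac{\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right)^2}
  {\sum_{i=1}^{n}\left(y_{i,\mathrm{exp}} - \bar{y}_{\mathrm{exp}}\right)^2} \tag{6}\\
\mathrm{RMSE}_{\mathrm{CV}} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}
  \left(y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right)^2} \tag{7}\\
\mathrm{MAE}_{\mathrm{CV}} &= \frac{1}{n}\sum_{i=1}^{n}
  \left|y_{i,\mathrm{exp}} - y_{i,\mathrm{pred}}\right| \tag{8}
\end{align}
```

Here \(\bar{y}_{\mathrm{exp}}\) denotes the mean of the experimental values over the learning set.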
Each of the selected models was then associated with an Applicability Domain (AD), defined as a bounding box. The pool of selected models extracted from the given dataset can then be used as a consensus predictor, returning for each input solvent candidate a mean value of the property estimates and its standard deviation, taken over the predictions returned by each model in the pool or, alternatively, over the predictions returned by only those models having the candidate within their AD.
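A bounding-box applicability domain and the resulting consensus prediction can be sketched as follows. The box is taken to be the per-descriptor minimum and maximum over the training set, which is the usual meaning of the term; the toy models, descriptor values and function names are hypothetical.

```python
from statistics import mean, stdev


def fit_bounding_box(training_descriptors):
    """Per-descriptor (min, max) over the training set."""
    return [(min(col), max(col)) for col in zip(*training_descriptors)]


def in_domain(box, descriptors):
    """True if every descriptor value lies inside the box."""
    return all(lo <= v <= hi for (lo, hi), v in zip(box, descriptors))


def consensus(models, descriptors, restrict_to_ad=True):
    """Mean and standard deviation over the pool of (predict, box) models.

    With restrict_to_ad=True only models whose box contains the candidate
    contribute; returns None if no model qualifies.
    """
    preds = [
        predict(descriptors)
        for predict, box in models
        if not restrict_to_ad or in_domain(box, descriptors)
    ]
    if not preds:
        return None
    spread = stdev(preds) if len(preds) > 1 else 0.0
    return mean(preds), spread


# Two toy models trained on different (hypothetical) descriptor ranges.
box_a = fit_bounding_box([[0.0, 0.0], [1.0, 2.0]])
box_b = fit_bounding_box([[0.0, 0.0], [0.5, 1.0]])
models = [
    (lambda d: d[0] + d[1], box_a),
    (lambda d: 2.0 * d[0], box_b),
]

candidate = [0.8, 0.5]
inside_only = consensus(models, candidate, restrict_to_ad=True)
all_models = consensus(models, candidate, restrict_to_ad=False)
```

In this example the candidate falls outside the box of the second model, so `inside_only` is based on a single prediction while `all_models` averages both.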
Outlying data points were defined as the data points, for which absolute errors (|χexp−χpred | ) from cross-validation were larger than 2×RMSECV threshold.
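The outlier criterion above translates directly into a few lines of Python. The values are hypothetical and serve only to show the thresholding.

```python
def flag_outliers(y_exp, y_pred):
    """Indices whose absolute CV error exceeds twice the RMSE."""
    n = len(y_exp)
    rmse = (sum((e - p) ** 2 for e, p in zip(y_exp, y_pred)) / n) ** 0.5
    return [
        i for i, (e, p) in enumerate(zip(y_exp, y_pred))
        if abs(e - p) > 2 * rmse
    ]


# Hypothetical cross-validation results; only the last point is far off.
y_exp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_pred = [1.1, 1.9, 3.1, 3.9, 5.1, 8.0]
outliers = flag_outliers(y_exp, y_pred)
```

Note that a single large error also inflates the RMSE itself, so the threshold adapts to the overall quality of the model.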
The absence of chance correlation was checked through the Y-randomization procedure. The Y-randomization test was performed in the following way: the −log10χ values (y values) were shuffled, models were built using the shuffled values, and the values for the corresponding cross-validation test set were predicted. This procedure was repeated 100 times for each fold and the maximum values of the out-of-bag coefficient of determination were reported.
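The idea of the Y-randomization test can be illustrated with a small self-contained sketch in which a one-descriptor least-squares line stands in for the actual machine-learning models. The data, the fold assignment and the function names are assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def q2_cv(xs, ys, n_folds=5):
    """Coefficient of determination from n_folds-fold cross-validation."""
    n = len(xs)
    preds = [0.0] * n
    for f in range(n_folds):
        train = [i for i in range(n) if i % n_folds != f]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in range(f, n, n_folds):
            preds[i] = slope * xs[i] + intercept
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def y_randomization(xs, ys, n_rounds=100, seed=0):
    """Maximum cross-validated Q2 obtained after shuffling the y values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scores.append(q2_cv(xs, shuffled))
    return max(scores)


# Synthetic data with a genuine structure-property relationship.
xs = [float(i) for i in range(30)]
ys = [0.5 * x + 2.0 for x in xs]
q2_real = q2_cv(xs, ys)
q2_shuffled_max = y_randomization(xs, ys)
```

A model that captures a real relationship should give `q2_real` well above `q2_shuffled_max`; comparable values would point to chance correlation.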
A library for virtual screening was prepared in the following way. First, all compounds from the ZINC database with a molecular weight no larger than 250 g/mol and a calculated logP in the range from −1 to 1 were retrieved. Structures were standardized and then filtered: compounds that did not contain a tertiary amine group, as well as compounds containing double bonds, aromatic rings, primary or secondary amine groups, ketones, or sulfur-containing groups other than thiols and thioethers, were removed. Structures of screened compounds as well as predicted values are available at https://github.com/AxelRolov/CO2_chemical_solvents.
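The first step of the library preparation, the property window, can be sketched as a simple filter over compound records. The records below are hypothetical; the substructure filters (tertiary amine present, no aromatic rings, and so on) require a cheminformatics toolkit and are not shown.

```python
def passes_property_window(compound, max_mw=250.0, logp_range=(-1.0, 1.0)):
    """Keep compounds with Mw <= max_mw and clogP inside logp_range."""
    lo, hi = logp_range
    return compound["mw"] <= max_mw and lo <= compound["clogp"] <= hi


# Hypothetical records; names and values are for illustration only.
candidates = [
    {"name": "candidate A", "mw": 119.2, "clogp": -0.6},
    {"name": "candidate B", "mw": 312.4, "clogp": 0.2},   # too heavy
    {"name": "candidate C", "mw": 180.0, "clogp": 1.8},   # too lipophilic
]
kept = [c["name"] for c in candidates if passes_property_window(c)]
```

Applying the cheap property window before standardization and substructure filtering keeps the number of structures that need full processing small.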
Experimental measurement of CO2 absorption rates
To measure the kinetics of absorption and desorption of acid gases in aqueous amine solutions, a thermoregulated constant interfacial area Lewis-type reactor cell was used36. The reactor (Supplementary Figs. 3–6) is equipped with an internal stirring system (magnetic stirrer) with the external motor. The operator needs to take care to select the speed of stirring without disturbing the interface (interface must be flat). Temperature is given by two platinum probes located at the upper and lower flanges (with the possibility to determine the gradient of temperature). The cell is immersed in a liquid bath. An electric resistor is introduced into the upper flange to control the gradient of temperature and avoid condensation of water and amine. Two capillary samplers are adapted to sample the vapor phase. The capillary samplers (ROLSI®, Armines’ patent) are capable of withdrawing and sending micro samples to a gas chromatograph without perturbing the equilibrium conditions over numerous samplings, thus leading to repeatable and reliable results. Analytical work was carried out using a gas chromatograph (PERICHROM model PR2100, France) equipped with a thermal conductivity detector (TCD) connected to a data software system. Helium is used as the carrier gas in this experiment. The model of the GC column is Porapak R (Porapak R 80/100 mesh, 1 m × 2 mm ID Silcosteel). Each ROLSI® sampler is connected to a TCD. A tube allows either to evacuate or to introduce CO2 from or into the cell. The kinetics of gas absorption are determined by recording the pressure drop through a calibrated pressure transducer. A computer equipped with data acquisition system records the pressure as a function of time.
The experimental procedure is the following:
The desired amount of solvent is introduced into the cell. The density obtained using a low-pressure vibrating tube densitometer (Anton Paar DSA 5000) is used to determine the exact mole number of solvent.
At least 5 bar of methane is added. We add methane because with this configuration, it is not possible to sample at pressures lower than GC carrier gas pressure.
CO2 is added from the thermal press. We record pressure and temperature before and after the loading (see Supplementary Fig. 7 as an example). This allows the mole number of CO2 introduced, and hence the CO2 loading, to be calculated very accurately.
The experimental method36 is similar to the one used to determine the solubility of CO2 in aqueous alkanolamine solutions at equilibrium. The method considered is based on the “static-synthetic method”. More details concerning the method are presented in the Supplementary Methods.
During the absorption of the CO2, we take samples to follow the evolution of the vapor composition (and so CO2 partial pressure) as a function of time. When the equilibrium is reached (constant pressure and constant temperature), the vapor phase composition is determined.
We have used the GERG 2008 Equation of state37 implemented in REFPROP 10.038 to estimate the densities of the vapor phase which is a mixture of CO2 and CH4.
The calculation of the acid gas solubility in the solvent is based on mass balance.
The volume of the liquid phase is obtained by considering the mole number of solvent introduced and its density at the temperature of measurement.
Consequently, the volume of the vapor phase is calculated by difference between the total volume and the volume of the liquid phase.
If the introduction of the solute doesn’t modify the level of the liquid interface in the equilibrium cell, we can consider Eq. (11).
Where rcell is the radius of the equilibrium cell, hliq the level of the vapor liquid interface.
The mole number of solute in the vapor phase is calculated by considering the density of the gas at the temperature and pressure of solute (\(P_{\mathrm{solute}} = P_{\mathrm{cell}} - P_{\mathrm{solvent}}^{\mathrm{sat}}\)). REFPROP v10.0 is used to calculate this density \(\rho^{\mathrm{V}}\left(T_{\mathrm{cell}}, P_{\mathrm{solute}}\right)\). In the case of a mixture, the global composition needs to be considered \(\rho^{\mathrm{V}}\left(T_{\mathrm{cell}}, P_{\mathrm{solute}}, y\right)\).
The volume of the vapor phase is used to calculate the mole number of solute in the vapor phase (Eq. (12)).
In the case of a mixture, the same equation is used to calculate the total mole number of solute in the vapor phase.
So, the mole number of solute in the liquid phase is determined by considering Eq. (13).
In the case of the mixture, the mole number of each species is calculated by considering the global composition of the mixture (z) and the composition of the vapor phase (y), Eq. (14).
The solubility is determined with Eq. (15).
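The mass-balance steps described above can be sketched in Python. Because Eqs. (11–15) are not reproduced in this version of the text, the sketch follows the verbal description only: liquid volume from the mole number and molar density of the solvent, vapor volume by difference, vapor-phase moles from the vapor density, and liquid-phase moles by difference. The function name, the use of molar densities, the definition of the loading as moles of CO2 per mole of amine, and all numerical values are assumptions; in practice the vapor density would come from an equation of state rather than a fixed number.

```python
def liquid_phase_moles(n_total, z, y, rho_vap, v_total, n_solvent, rho_liq):
    """Moles of each solute species in the liquid phase from a mass balance.

    n_total   total moles of solute (e.g. CO2 + CH4) loaded into the cell
    z, y      global and vapor-phase mole fractions, keyed by species
    rho_vap   molar density of the vapor phase (mol per unit volume)
    v_total   total cell volume
    n_solvent moles of solvent loaded; rho_liq its molar density
    """
    v_liq = n_solvent / rho_liq   # liquid volume from solvent amount and density
    v_vap = v_total - v_liq       # vapor volume by difference
    n_vap = rho_vap * v_vap       # total moles of solute in the vapor phase
    return {s: z[s] * n_total - y[s] * n_vap for s in z}


# Hypothetical numbers in SI units (m^3, mol, mol/m^3).
n_liq = liquid_phase_moles(
    n_total=0.3,
    z={"CO2": 0.7, "CH4": 0.3},
    y={"CO2": 0.1, "CH4": 0.9},
    rho_vap=200.0,
    v_total=1.0e-3,
    n_solvent=20.0,
    rho_liq=40000.0,
)

# Assumed definition of the loading: moles of CO2 absorbed per mole of amine.
n_amine = 0.13 * 20.0
loading = n_liq["CO2"] / n_amine
```

With these illustrative inputs almost all of the CO2 ends up in the liquid phase while the methane stays in the vapor, which mirrors the role of methane as an inert pressurizing gas in the procedure.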
Data availability
All the experimental data are available in Supplementary Materials and at https://github.com/AxelRolov/CO2_chemical_solvents. Structures of compounds, descriptors and predicted values are also available at https://github.com/AxelRolov/CO2_chemical_solvents. The data are also deposited into a DOI-minting repository ZENODO: https://doi.org/10.5281/zenodo.6010667.
Code availability
Jupyter notebooks containing the Python code used for model building, evaluation and virtual screening are available at https://github.com/AxelRolov/CO2_chemical_solvents. The code is also deposited into a DOI-minting repository ZENODO: https://doi.org/10.5281/zenodo.6010667. Python libraries used for machine-learning and OPERA software are freely available. ISIDA-Fragmentor is available upon reasonable request to Prof. Alexandre Varnek.
References
Birol, F., Cozzi, L., & Gül, T. Net Zero by 2050—Analysis. IEA https://www.iea.org/reports/net-zero-by-2050 (2021).
Hepburn, C. et al. The technological and economic prospects for CO2 utilization and removal. Nature 575, 87–97 (2019).
Bui, M. et al. Carbon capture and storage (CCS): the way forward. Energy Environ. Sci. 11, 1062–1176 (2018).
Rochelle, G. T. Amine scrubbing for CO2 capture. Science 325, 1652–1654 (2009).
Brickett, L. Carbon Dioxide Capture Handbook. (US Department of Energy (DOE)/NETL, 2015). https://www.netl.doe.gov/sites/default/files/netl-file/Carbon-Dioxide-Capture-Handbook-2015.pdf.
Smit, B. Carbon Capture and Storage: introductory lecture. Faraday Discuss 192, 9–25 (2016).
Borhani, T. N. & Wang, M. Role of solvents in CO2 capture processes: the review of selection and design methods. Renew. Sustain. Energy Rev. 114, 109299 (2019).
de Meyer, F. & Bignaud, C. The use of catalysis for faster CO2 absorption and energy-efficient solvent regeneration: an industry-focused critical review. Chem. Eng. J. 428, 131264 (2022).
Li, L. et al. Amine blends using concentrated piperazine. Energy Procedia 37, 353–369 (2013).
Lin, L.-C. et al. In silico screening of carbon-capture materials. Nat. Mater. 11, 633–641 (2012).
Boyd, P. G. et al. Data-driven design of metal–organic frameworks for wet flue gas CO2 capture. Nature 576, 253–256 (2019).
Conway, W. et al. Designer amines for post combustion CO2 capture processes. Energy Procedia 63, 1827–1834 (2014).
Kuenemann, M. A. & Fourches, D. Cheminformatics modeling of amine solutions for assessing their CO2 absorption properties. Mol. Inf. 36, 1600143 (2017).
Khaheshi, S., Riahi, S., Mohammadi-Khanaposhtani, M. & shokrollahzadeh, H. Prediction of amines capacity for carbon dioxide absorption based on structural characteristics. Ind. Eng. Chem. Res. 58, 8763–8771 (2019).
Rezaei, B., Riahi, S. & Gorji, A. E. Molecular investigation of amine performance in the carbon capture process: least squares support vector machine approach. Korean J. Chem. Eng. 37, 72–79 (2020).
Cheng, J. et al. Quantitative relationship between CO2 absorption capacity and amine water system: DFT, statistical, and experimental study. Ind. Eng. Chem. Res. 58, 13848–13857 (2019).
Gonfa, G., Bustam, M. A. & Shariff, A. M. Quantum-chemical-based quantitative structure-activity relationships for estimation of CO2 absorption/desorption capacities of amine-based absorbents. Int. J. Greenh. Gas. Control 49, 372–378 (2016).
Porcheron, F. et al. Graph machine based-QSAR approach for modeling thermodynamic properties of amines: application to CO2 capture in postcombustion. Oil Gas. Sci. Technol. – Rev. D’IFP Energ. Nouv. 68, 469–486 (2013).
Rozanska, X., Wimmer, E. & de Meyer, F. Quantitative kinetic model of CO2 absorption in aqueous tertiary amine solvents. J. Chem. Inf. Model. 61, 1814–1824 (2021).
Chowdhury, F. A., Yamada, H., Higashii, T., Goto, K. & Onoda, M. CO2 capture by tertiary amine absorbents: a performance comparison study. Ind. Eng. Chem. Res. 52, 8323–8331 (2013).
Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
Sterling, T. & Irwin, J. J. ZINC 15—ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Irwin, J. J. & Shoichet, B. K. ZINC – a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 45, 177–182 (2005).
Mansouri, K., Grulke, C. M., Judson, R. S. & Williams, A. J. OPERA models for predicting physicochemical properties and environmental fate endpoints. J. Cheminformatics 10, 10 (2018).
Varnek, A. et al. ISIDA—platform for virtual screening based on fragment and pharmacophoric descriptors. Curr. Comput. Aided-Drug Des. 4, 191–198 (2008).
Ruggiu, F., Marcou, G., Varnek, A. & Horvath, D. ISIDA property-labelled fragment descriptors. Mol. Inf. 29, 855–868 (2010).
Breiman, L. Random forests. Mach. Learn 45, 5–32 (2001).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).
Tropsha, A., Gramatica, P. & Gombar, V. K. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22, 69–77 (2003).
Landrum, G. RDKit: Open-source cheminformatics; http://www.rdkit.org (2021).
Berthold, M. R. et al. KNIME - the Konstanz Information Miner: Version 2.0 and Beyond SIGKDD Explor. Newsl. 11, 26–31 (ACM, New York, NY, USA, 2009).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Baumann, D. & Baumann, K. Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation. J. Cheminformatics 6, 47 (2014).
Coquelet, C., Valtz, A. & Théveneau, P. Experimental Determination of Thermophysical Properties of Working Fluids for ORC Applications. Organic Rankine Cycles for Waste Heat Recovery - Analysis and Applications (IntechOpen, 2019).
Kunz, O. & Wagner, W. The GERG-2008 wide-range equation of state for natural gases and other mixtures: an expansion of GERG-2004. J. Chem. Eng. Data 57, 3032–3091 (2012).
Lemmon, E.W., Bell, I.H., Huber, M.L. & McLinden, M.O. NIST standard reference database 23: reference fluid thermodynamic and transport properties-refprop, version 10.0, national institute of standards and technology, standard reference data program, Gaithersburg, https://doi.org/10.18434/T4/1502528 (2018).
Acknowledgements
This work was supported by the Carbon Capture Utilization and Storage (CCUS) transverse R&D program from TotalEnergies S.E.
Author information
Contributions
A.A.O. performed machine learning, analyzed, interpreted the data, and contributed to the writing of the manuscript. X.R. and E.W. performed the molecular simulations. A.Valtz and C.C. performed the experimental part of the work. G.M. and D.H. contributed to the machine-learning models. B.P. contributed to the planning of the research. A. Varnek conceived, planned, and guided the part of the research related to building machine-learning models. F.D.M. conceived, planned, guided the research, analyzed, and interpreted the data, and wrote the manuscript. All authors critically analyzed data, edited, and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Chemistry thanks Agilio Padua and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Orlov, A.A., Valtz, A., Coquelet, C. et al. Computational screening methodology identifies effective solvents for CO2 capture. Commun Chem 5, 37 (2022). https://doi.org/10.1038/s42004-022-00654-y
DOI: https://doi.org/10.1038/s42004-022-00654-y