Machine learning pipeline for battery state-of-health estimation


Lithium-ion batteries are ubiquitous in applications ranging from portable electronics to electric vehicles. Irrespective of the application, reliable real-time estimation of battery state of health (SOH) by on-board computers is crucial to the safe operation of the battery, ultimately safeguarding asset integrity. In this Article, we design and evaluate a machine learning pipeline for estimation of battery capacity fade—a metric of battery health—on 179 cells cycled under various conditions. The pipeline estimates battery SOH with an associated confidence interval by using two parametric and two non-parametric algorithms. Using segments of charge voltage and current curves, the pipeline engineers 30 features, performs automatic feature selection and calibrates the algorithms. When deployed on cells operated under the fast-charging protocol, the best model achieves a root-mean-squared error of 0.45%. This work provides insights into the design of scalable data-driven models for battery SOH estimation, emphasizing the value of confidence bounds around the prediction. The pipeline methodology combines experimental data with machine learning modelling and could be applied to other critical components that require real-time estimation of SOH.
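The two quantities the abstract highlights, a root-mean-squared error and a confidence interval around each SOH estimate, can be illustrated with a minimal sketch. This is not the authors' code: the function names `rmse` and `interval_coverage`, the SOH values and the fixed interval half-width are all hypothetical, and the error is assumed to be expressed in percentage points of capacity.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def interval_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    inside = sum(lo <= t <= hi for t, lo, hi in zip(y_true, lower, upper))
    return inside / len(y_true)


# Hypothetical SOH values in percent of nominal capacity (made-up numbers).
soh_true = [100.0, 98.0, 96.0, 94.0]
soh_pred = [99.5, 98.5, 96.5, 93.5]

# Assumed symmetric interval half-width, in percentage points.
half_width = 0.6
lower = [p - half_width for p in soh_pred]
upper = [p + half_width for p in soh_pred]

print(f"RMSE: {rmse(soh_true, soh_pred):.2f} percentage points")
print(f"Coverage: {interval_coverage(soh_true, lower, upper):.2f}")
```

Comparing the empirical coverage with the nominal confidence level is one simple way to judge whether an interval is well calibrated: a nominal 95% interval that contains the true value far less often than 95% of the time is overconfident.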

A preprint version of the article is available at arXiv.


Fig. 1: The constant current−constant voltage (CC−CV) charge protocol and extracted ageing segment of the curves for a Li-ion pouch cell.
Fig. 2: Prediction results with dNNe Group I cell number 38.
Fig. 3
Fig. 4: Prediction results with dNNe Group II cell number 1.
Fig. 5: Prediction results with dNNe Group III cell number 5.

Data availability

The datasets used in this study are available at, for Group 1, and, for Group 2,, and for Group 3,

Code availability

Code for the data processing is available from the corresponding authors upon request. Code for the modelling work is available at




This work was supported by the Lloyd’s Register Foundation (grant number AtRI_100015), The Engineering and Physical Sciences Research Council (EPSRC), the Center for Doctoral Training in Embedded Intelligence, and Baker Hughes (grant number EP/L014998/1). The work was further supported by the EPSRC through the UK National Centre for Energy Systems Integration (CESI) (grant number EP/P001173/1), and by InnovateUK through the Responsive Flexibility (ReFlex) (project reference 104780). We thank the more than 150 companies and organizations that support research activities at the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland annually.

Author information




D.R. conceived the study, analysed the experimental data, developed the machine learning pipeline and wrote the paper, while S.S. assisted with experimental data interpretation, problem statement formulation, and feature engineering. V.R. provided technical input for the machine learning method development, while D.F. and M.P. provided input for the battery SOH application. V.R., M.P. and D.F. supervised the work. All authors commented on and reviewed the manuscript.

Corresponding author

Correspondence to Darius Roman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

This file contains Supplementary text, Figures and Tables.


About this article


Cite this article

Roman, D., Saxena, S., Robu, V. et al. Machine learning pipeline for battery state-of-health estimation. Nat Mach Intell 3, 447–456 (2021).
