  • Technical Report

A machine-vision-based frailty index for mice

Abstract

Heterogeneity in biological aging manifests itself in health status and mortality. Frailty indices (FIs) capture health status in humans and model organisms. To accelerate our understanding of biological aging and carry out scalable interventional studies, high-throughput approaches are necessary. Here we introduce a machine-learning-based visual FI for mice that operates on video data from an open-field assay. We use machine vision to extract morphometric, gait and other behavioral features that correlate with FI score and age. We use these features to train a regression model that accurately predicts the normalized FI score within 0.04 ± 0.002 (mean absolute error). Unnormalized, this error is 1.08 ± 0.05, which is comparable to one FI item being mis-scored by 1 point or two FI items mis-scored by 0.5 points. This visual FI provides increased reproducibility and scalability that will enable large-scale mechanistic and interventional studies of aging in mice.
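
The relation between the two error figures quoted above can be checked with a little arithmetic. The following is an illustrative Python sketch, not the authors' code. It assumes the manual FI has 27 items, each scored 0, 0.5 or 1; that count is inferred here from the ratio 1.08 / 0.04 and is not stated in this section. The example scores are made up.

```python
def mean_absolute_error(true_scores, predicted_scores):
    """Average absolute difference between paired scores."""
    if len(true_scores) != len(predicted_scores):
        raise ValueError("score lists must have the same length")
    total = sum(abs(t - p) for t, p in zip(true_scores, predicted_scores))
    return total / len(true_scores)


# Assumption: 27 FI items (inferred from 1.08 / 0.04), so the
# unnormalized FI ranges from 0 to 27 and the normalized FI from 0 to 1.
N_ITEMS = 27


def unnormalize(normalized_error, n_items=N_ITEMS):
    """Convert an error on the 0-1 scale to the raw item-score scale."""
    return normalized_error * n_items


# Made-up normalized FI scores for four animals (illustration only).
true_fi = [0.10, 0.20, 0.30, 0.40]
predicted_fi = [0.14, 0.16, 0.34, 0.36]

mae = mean_absolute_error(true_fi, predicted_fi)
raw_error = unnormalize(0.04)
print(f"normalized MAE = {mae:.2f}, unnormalized = {raw_error:.2f}")
```

Under that assumption, a raw error of about 1 corresponds to one item off by a full point or two items off by half a point each, which is the comparison the abstract draws.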

Fig. 1: Approach overview to build a visual frailty index.
Fig. 2: Sample features used in the vFI.
Fig. 3: Comparison of male and female measures.
Fig. 4: Prediction of age and frailty from video features.
Fig. 5: Quantile regression modeling of vFI using generalized random forests.

Data availability

Files containing the manual FI scores and vFI features for all mice in our dataset have been submitted as the source data for figures. Both files can also be found on the GitHub repository https://github.com/KumarLabJax/vFI-modeling and on the Zenodo repository https://zenodo.org/badge/latestdoi/412051716.

Code availability

Data collection was carried out using custom Python 3 code for video collection and processing; details and code can be found on the GitHub page https://github.com/KumarLabJax/JABS-data-pipeline. Code and models will be available from the Kumar Lab GitHub account (https://github.com/KumarLabJax) and https://www.kumarlab.org/data/. The markdown file in the GitHub repository https://github.com/KumarLabJax/vFI-modeling contains requirements and details for reproducing the results in the article and for training your own models for vFI and age prediction. This modeling code is written in R, has also been released at https://zenodo.org/badge/latestdoi/412051716 and is available as the supplementary software file ‘SupplementarySoftware 1.zip’. Python 3 code for generating engineered features can be found at https://github.com/KumarLabJax/vFI-features and has also been released at https://zenodo.org/badge/latestdoi/410956452.

References

  1. Mitnitski, A., Mogilner, A. & Rockwood, K. Accumulation of deficits as a proxy measure of aging. Sci. World J. 1, 323–336 (2001).

  2. Whitehead, J. C. et al. A clinical frailty index in aging mice: comparisons with frailty index data in humans. J. Gerontol. A Biomed. Sci. Med. Sci. 69, 621–632 (2014).

  3. Rockwood, K., Fox, R. A., Stolee, P., Robertson, D. & Beattie, B. L. Frailty in elderly people: an evolving concept. CMAJ 150, 489–495 (1994).

  4. Searle, S. D., Mitnitski, A., Gahbauer, E. A., Gill, T. M. & Rockwood, K. A standard procedure for creating a frailty index. BMC Geriatrics 8, 24 (2008).

  5. Schultz, M. B. et al. Age and life expectancy clocks based on machine-learning analysis of mouse frailty. Nat. Commun. 11, 1–12 (2020).

  6. Kim, S., Myers, L., Wyckoff, J., Cherry, K. E. & Jazwinski, S. M. The frailty index outperforms DNA methylation age and its derivatives as an indicator of biological age. GeroSci. 39, 83–92 (2017).

  7. Kojima, G., Iliffe, S. & Walters, K. Frailty index as a predictor of mortality: a systematic review and meta-analysis. Age Ageing 47, 193–200 (2017).

  8. Parks, R. et al. A procedure for creating a frailty index based on deficit accumulation in aging mice. J. Gerontol. A Biol. Sci. Med. Sci. 67, 217–227 (2012).

  9. Rockwood, K. et al. A frailty index based on deficit accumulation quantifies mortality risk in humans and in mice. Sci. Rep. 7, 43068 (2017).

  10. Kane, A. E., Ayaz, O., Ghimire, A., Feridooni, H. A. & Howlett, S. E. Implementation of the mouse frailty index. Can. J. Physiol. Pharmacol. 95, 1149–1155 (2017).

  11. Feridooni, H. A., Sun, M. H., Rockwood, K. & Howlett, S. E. Reliability of a frailty index based on the clinical assessment of health deficits in male C57BL/6J mice. J. Gerontol. A 70, 686–693 (2014).

  12. Kane, A. E. et al. Factors that impact on interrater reliability of the mouse clinical frailty index. J. Gerontol. A 70, 694–695 (2015).

  13. Walsh, R. N. & Cummins, R. A. The Open field test: a critical review. Psychol. Bull. 83, 482–504 (1976).

  14. Crawley, J. N. What’s Wrong With My Mouse: Behavioral Phenotyping of Transgenic and Knock-out Mice (Wiley, 2007).

  15. Ziegler, L., Sturman, O. & Bohacek, J. Big behavior: challenges and opportunities in a new era of deep behavior profiling. Neuropsychopharmacology 46, 33–44 (2020).

  16. Kumar, V. et al. Second-generation high-throughput forward genetic screen in mice to isolate subtle behavioral mutants. Proc. Natl Acad. Sci. USA 108, 15557–15564 (2011).

  17. LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

  18. Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Comm. ACM 60, 84–90 (2017).

  19. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  20. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. in Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  21. Schmidhuber, J. Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015).

  22. Raghu, M. & Schmidt, E. A survey of deep learning for scientific discovery. Preprint at arXiv https://arxiv.org/abs/2003.11755 (2020).

  23. Geuther, B. et al. Robust mouse tracking in complex environments using neural networks. Commun. Biol. 2, 124 (2019).

  24. Geuther, B. Q. et al. Action detection using a neural network elucidates the genetics of mouse grooming behavior. eLife 10, e63207 (2021).

  25. Sheppard, K. et al. Stride-level analysis of mouse open field behavior using deep-learning-based pose estimation. Cell Rep. 38, 110231 (2022).

  26. Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).

  27. Wiltschko, A. B. et al. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat. Neurosci. 23, 1433–1443 (2020).

  28. Hsu, A. I. & Yttri, E. A. B-SOiD: An open source unsupervised algorithm for discovery of spontaneous behaviors. Nat. Commun. 12, 5188 (2021).

  29. Baumann, C., Kwak, D. & Thompson, L. Sex-specific components of frailty in C57BL/6 mice. Aging 11, 5206–5214 (2019).

  30. Sampathkumar, N. K. et al. Widespread sex dimorphism in aging and age-related diseases. Hum. Genet. 139, 333–356 (2020).

  31. Austad, S. N. in Handbook of the Biology of Aging 479–495 (Elsevier, 2011).

  32. Austad, S. N. & Fischer, K. E. Sex differences in lifespan. Cell Metab. 23, 1022–1033 (2016).

  33. Sukoff Rizzo, S. J. et al. Assessing healthspan and lifespan measures in aging mice: optimization of testing protocols, replicability, and rater reliability. Curr. Protoc. Mouse Biol. 8, e45 (2018).

  34. Hartigan, J. A. & Hartigan, P. M. The dip test of unimodality. Ann. Stat. 13, 70–84 (1985).

  35. Simpson, E. H. The interpretation of interaction in contingency tables. J. R. Stat. Soc. B Methodol. 13, 238–241 (1951).

  36. Pappas, L. & Nagy, T. The translation of age-related body composition findings from rodents to humans. Eur. J. Clin. Nutr. 73, 172–178 (2018).

  37. Zhou, Y. et al. The detection of age groups by dynamic gait outcomes using machine-learning approaches. Sci. Rep. 10, 4426 (2020).

  38. Skiadopoulos, A., Moore, E. E., Sayles, H. R., Schmid, K. K. & Stergiou, N. Step width variability as a discriminator of age-related gait changes. J. Neuroeng. Rehab. 17, 41 (2020).

  39. Tarantini, S. et al. Age-related alterations in gait function in freely moving male C57BL/6 mice: translational relevance of decreased cadence and increased gait variability. J. Gerontol. A 74, 1417–1421 (2018).

  40. Bair, W.-N. et al. Of aging mice and men: gait speed decline is a translatable trait, with species-specific underlying properties. J. Gerontol. A 74, 1413–1416 (2019).

  41. Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B Stat. Methodol. 67, 301–320 (2005).

  42. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).

  43. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).

  44. Friedman, J. H. Greedy function approximation: a gradient-boosting machine. Ann. Stat. 29, 1189–1232 (2001).

  45. Zhang, H., Zimmerman, J., Nettleton, D. & Nordman, D. J. Random forest prediction intervals. Am. Stat. 74, 1–15 (2019).

  46. Doshi-Velez, F. & Kim, B. Towards a rigorous science of interpretable machine learning. Preprint at arXiv https://doi.org/10.48550/arXiv.1702.08608 (2017).

  47. Molnar, C. Interpretable Machine Learning (Lulu.com, 2020).

  48. Friedman, J. H. et al. Predictive learning via rule ensembles. Ann. Appl. Stat. 2, 916–954 (2008).

  49. Apley, D. W. & Zhu, J. Visualizing the effects of predictor variables in black box supervised learning models. J. R. Stat. Soc. B Stat. Methodol. 82, 1059–1086 (2020).

  50. Mizrahi-Lehrer, E., Cepeda-Valery, B. & Romero-Corral, A. in Handbook of Anthropometry: Physical Measures of Human Form in Health and Disease (ed Preedy, V. R.) 385–395 (Springer, 2012).

  51. Pappas, L. E. & Tim, R. N. The translation of age-related body composition findings from rodents to humans. Eur. J. Clin. Nutr. 73, 172–178 (2019).

  52. Huffman, D. M. & Barzilai, N. Role of visceral adipose tissue in aging. Biochim. Biophys. Acta 1790, 1117–1123 (2009).

  53. Gerbaix, M., Metz, L., Ringot, E. & Courteix, D. Visceral fat mass determination in rodent: Validation of dual-energy X-ray absorptiometry and anthropometric techniques in fat and lean rats. Lipids Health Dis. 9, 140 (2010).

  54. Imagama, S. et al. Back muscle strength and spinal mobility are predictors of quality of life in middle-aged and elderly males. Eur. Spine J. 20, 954–961 (2011).

  55. Kane, A., Keller, K. M., Heinze-Milne, S. D., Grandy, S. & Howlett, S. A murine frailty index based on clinical and laboratory measurements: links between frailty and pro-inflammatory cytokines differ in a sex-specific manner. J. Gerontol. A 74, 275–282 (2019).

  56. Beane, G. et al. Video based phenotyping platform for the laboratory mouse. Preprint at bioRxiv https://doi.org/10.1101/2022.01.13.476229 (2022).

  57. Pereira, T. D., Shaevitz, J. W. & Murthy, M. Quantifying behavior to understand the brain. Nat. Neurosci. 23, 1537–1549 (2020).

  58. Mathis, A., Schneider, S., Lauer, J. & Mathis, M. W. A primer on motion capture with deep learning: principles, pitfalls, and perspectives. Neuron 108, 44–65 (2020).

  59. Singh, P. P., Demmitt, B. A., Nath, R. D. & Brunet, A. The genetics of aging: a vertebrate perspective. Cell 177, 200–220 (2019).

  60. Crainiceanu, C. M. & Ruppert, D. Likelihood ratio tests in linear mixed models with one variance component. J. R. Stat. Soc. B Stat. Methodol. 66, 165–185 (2004).

  61. Agresti, A. Categorical Data Analysis (John Wiley & Sons, 2003).

  62. Bates, D., Maechler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).

  63. Kenward, M. G. & Roger, J. H. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53, 983–997 (1997).

  64. Fai, A. H.-T. & Cornelius, P. L. Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. J. Stat. Comput. Simul. 54, 363–378 (1996).

  65. McCullagh, P. Regression models for ordinal data. J. R. Stat. Soc. B Methodol. 42, 109–127 (1980).

  66. Meinshausen, N. Quantile regression forests. J. Mach. Learn. Res. 7, 983–999 (2006).

  67. Athey, S. et al. Generalized random forests. Ann. Stat. 47, 1148–1178 (2019).

Acknowledgements

We thank Kumar Laboratory members S. Deats, T. Sproule, B. Geuther and K. Sheppard for behavioral testing, data processing and helpful advice. We thank Shock Center and Churchill Laboratory members H. Donato, G. Garland, M. Leland and L. Robinson for frailty indexing and coordinating. We thank T. Helenius for editing. We thank the members of the JAX Information Technology team for infrastructure support. This work was funded by The Jackson Laboratory Director’s Innovation Fund, the National Institutes of Health (DA041668 and DA048634 to V.K.) and the Nathan Shock Centers of Excellence in the Basic Biology of Aging (AG38070 to G.A.C.).

Author information

Contributions

G.A.C. contributed the mice and data collection. L.E.H. and G.S.S. both contributed to the analysis of the dataset. L.E.H., G.S.S. and V.K. contributed to the writing of the article. All authors discussed results and contributed to the direction of the article.

Corresponding authors

Correspondence to Gary A. Churchill or Vivek Kumar.

Ethics declarations

Competing interests

The Jackson Laboratory has filed a provisional patent on the methods described in this article.

Peer review

Peer review information

Nature Aging thanks Eric Yttri, Johannes Bohacek and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Dr. Marie Anne O'Donnell, Dr. Manlio Vinciguerra and Dr. Sebastien Thuault, in collaboration with the Nature Aging team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Estimation of the scorer effect in clinical FI items.

A, The effect of tester varies across FI items. B, The estimated random effect across 4 scorers in the data set.

Extended Data Fig. 2 Detailed modeling analysis.

A, The distributions of age and of manual FIadj scores across 643 data points (533 mice). B, To determine the contributions of frailty parameters in predicting Age, we calculated the feature importance of all frailty parameters. We found that gait disorders, kyphosis and piloerection have the highest contributions. C, The random forest regression model performed better than the other models, with the lowest root-mean-squared error (RMSE) (n = 50 independent train-test splits, p < 2.2e−16, F(3,147) = 59.53) and highest R² (p < 2.2e−16, F(3,147) = 58.14) when compared using repeated-measures ANOVA. D, The vFRIGHT model performed better than the FRIGHT model, with a lower RMSE (n = 50 independent train-test splits, RMSE_vFRIGHT = 17.97 ± 1.44, RMSE_FRIGHT = 20.62 ± 4.78, p < 6.1e−7, F(1,49) = 32.84) and higher R² (R²_vFRIGHT = 0.78 ± 0.04, R²_FRIGHT = 0.76 ± 0.07, p < 2.1e−8, F(1,49) = 44.54) when compared using repeated-measures ANOVA. E, The random forest regression model for predicting FI score on unseen future data performed better than all other models, with the lowest RMSE (n = 50 independent train-test splits, p < 8.3e−14, F(3,147) = 26.62) and highest R² (p < 4.7e−14, F(3,147) = 27.2). F, The plot shows the distribution of counts (0, green; 0.5, orange; 1, purple) for individual frailty parameters. For several parameters, such as Nasal discharge, Rectal prolapse, Vaginal uterine and Diarrhea, the proportion of 0 counts is 1 (p0 = 1). Similarly, Dermatitis, Cataracts, Eye discharge swelling, Microphthalmia, Corneal opacity, Tail stiffening and Malocclusions have p0 > 0.95.
G, Residuals versus index and predicted versus true values for the training set (column 1; residual standard error = 8.5, difference in slopes (black versus gray) = 0.11) and the test set (column 2; residual standard error = 15.87, difference in slopes (black versus gray) = 0.30) for the model that predicts Age from frailty index items. H, I, Out-of-bag (OOB) error-based 95% prediction intervals (PIs) (gray lines) quantifying uncertainty in point estimates/predictions (gray dots). There is one interval per test mouse (n = 107 unique mice; the test data contain some repeats of the same mice tested at different ages), and approximately 95% of the PIs contain the correct Age (red dots) and FI scores (blue dots). We ordered the x-axis (Test set index) in ascending order (from left to right) of the actual age/FI. The average PI width across all test mice is 0.18 ± 0.04 for the predicted FI score (71.96 ± 18.52 for the predicted Age), while the PI widths range from 0.08 to 0.29 (28 to 113 for Age). In C, D and E, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), respectively, the line in the middle corresponds to the median, and the upper (lower) whisker extends from the upper (lower) hinge to the largest (smallest) value no further than 1.5 × IQR from the hinge, where IQR is the interquartile range.
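
The coverage and width figures reported for the prediction intervals in H and I are simple summaries of a set of intervals. A minimal Python sketch of those two summaries follows. It does not reproduce the out-of-bag procedure that generates the intervals; it only summarizes intervals that are already available, and all numbers in it are made up.

```python
def interval_summary(lower, upper, truth):
    """Return (coverage fraction, mean width) for prediction intervals.

    lower, upper: interval bounds, one pair per test animal.
    truth: the observed value for each test animal.
    """
    if not (len(lower) == len(upper) == len(truth)):
        raise ValueError("inputs must have the same length")
    # An interval "covers" the truth when the observed value lies inside it.
    covered = sum(lo <= t <= hi for lo, hi, t in zip(lower, upper, truth))
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    return covered / len(truth), sum(widths) / len(widths)


# Hypothetical normalized-FI intervals for five test animals.
lower = [0.05, 0.10, 0.20, 0.25, 0.30]
upper = [0.20, 0.30, 0.40, 0.45, 0.40]
truth = [0.10, 0.25, 0.45, 0.30, 0.35]

coverage, mean_width = interval_summary(lower, upper, truth)
print(f"coverage = {coverage:.2f}, mean width = {mean_width:.2f}")
```

For a well-calibrated 95% interval, the coverage computed this way on held-out animals should be close to 0.95.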

Extended Data Fig. 3 Correlation between video metrics.

A, Correlation between mean (x-axis) and median (y-axis) video gait metrics. The diagonal line corresponds to the maximum correlation, that is, 1. B, Correlation between interquartile range (IQR, x-axis) and standard deviation (Stdev, y-axis) video gait metrics. The diagonal line again corresponds to a correlation of 1. A tight clustering of points around the diagonal line indicates a high correlation between mean and median, or between IQR and standard deviation, for the respective metric.
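
As an illustration of the comparison in panel A, the Python sketch below summarizes one gait metric per animal by its mean and by its median and then correlates the two summaries across animals. The metric, animal labels and stride values are hypothetical, and Pearson correlation is assumed here because the caption does not name the coefficient.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-stride step widths (cm) for four animals.
strides = {
    "mouse_1": [2.0, 2.1, 2.2, 2.3, 2.9],
    "mouse_2": [2.5, 2.6, 2.7, 2.8, 2.9],
    "mouse_3": [3.0, 3.1, 3.3, 3.4, 3.7],
    "mouse_4": [1.5, 1.6, 1.8, 1.9, 2.2],
}

# One summary value per animal for each statistic.
means = [statistics.fmean(v) for v in strides.values()]
medians = [statistics.median(v) for v in strides.values()]

r = pearson_r(means, medians)
print(f"mean-versus-median correlation: r = {r:.3f}")
```

The same function applies to panel B by replacing the two summaries with a per-animal IQR and standard deviation.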

Extended Data Fig. 4 Test for Simpson’s paradox.

A, Simpson (1951) showed that a statistical relationship observed in a population can be reversed within all of the subgroups that make up that population, leading to erroneous conclusions drawn from the population data. To test for the manifestation of Simpson’s paradox in our data, we split the bimodal Age distribution into two separate unimodal distributions (clusters), that is, less than 70 weeks old (L70, red) versus more than 70 weeks old (U70, blue). Next, we plotted the dependent variable (frailty) against each of the independent variables/features in our data and fit a simple linear regression model to each subgroup separately (solid red and blue lines) as well as to the aggregate data (black dotted line). B, We quantified the correlations by measuring the slope of the linear fits of the features (Y) on Age (X). We computed the slopes for L70, U70 and the overall data (All), then plotted the slopes for features in decreasing order of their relevance to the model (in which we predict Age from these features). We then performed a one-way ANOVA to test for differences in slopes among the L70 and U70 subgroups and the overall data (F(2,141) = 1.162, p > 0.32), followed by false-discovery-rate-adjusted post hoc pairwise comparisons using the t-test. We found no significant differences in these comparisons (L70 versus U70, p = 0.38; L70 versus All, p = 0.77; U70 versus All, p = 0.38). We conclude that Simpson’s paradox does not manifest in any of the top fifteen features in our data.
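
The slope comparison in panel B can be sketched in a few lines of Python. The example below fits a least-squares slope of a feature on age within the L70 and U70 subgroups and on the pooled data, then flags a sign reversal. The sign check is a simplification introduced here for illustration; the analysis described above compares slopes with ANOVA and t-tests. The data are synthetic and deliberately constructed to exhibit the paradox, and the function names are hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def subgroup_slopes(ages, values, cutoff=70):
    """Slopes of a feature on age for L70, U70 and all animals.

    Animals exactly at the cutoff fall in neither subgroup, matching the
    "less than" / "more than" wording of the caption.
    """
    young = [(a, v) for a, v in zip(ages, values) if a < cutoff]
    old = [(a, v) for a, v in zip(ages, values) if a > cutoff]
    return {
        "L70": ols_slope(*zip(*young)),
        "U70": ols_slope(*zip(*old)),
        "All": ols_slope(ages, values),
    }


def sign_reversal(slopes):
    """True if both subgroup slopes share a sign that the pooled slope opposes."""
    same_sign = slopes["L70"] * slopes["U70"] > 0
    return same_sign and slopes["L70"] * slopes["All"] < 0


# Synthetic example: the feature rises with age within each group,
# but the older group starts from a lower baseline.
ages = [10, 20, 30, 80, 90, 100]
feature = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]

slopes = subgroup_slopes(ages, feature)
print(slopes, "reversal:", sign_reversal(slopes))
```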

Extended Data Fig. 5 Further experiments to test model performance and parameters.

A, We compare the performance of different feature sets, (1) age alone, (2) video and (3) age + video, in predicting frailty across n = 50 independent train-test splits. We use age alone as a feature in a linear model (AgeL) and in a generalized additive nonlinear model (AgeG). Although we did not see a clear improvement of the random forest model using video features (VideoRF) over a prediction based on age alone, a clear improvement in prediction performance is seen for the model containing video features + age (AllRF), with the lowest MSE (p < 2.2e−16, F(3,147) = 213.79, LMM post hoc pairwise comparison with AgeG, t(147) = −12.21, FDR-adjusted p < 0.0001), lowest RMSE (p < 2.2e−16, F(3,147) = 172.88, LMM post hoc pairwise comparison with AgeG, t(147) = −14.12, FDR-adjusted p < 0.0001) and highest R² (p < 2.2e−16, F(3,147) = 171.12, LMM post hoc pairwise comparison with AgeG, t(147) = 14.07, FDR-adjusted p < 0.0001). This shows that video features add important information pertaining to frailty that age alone does not. B, We picked animals whose ages and FI scores had an inverse relationship, that is, younger animals with higher FI scores and older animals with lower FI scores. We formed 5 test sets (n = 43, 38, 45, 42, 20) containing animals meeting these criteria and trained the random forest (RF) model on the remaining mice. The model using only video features (VideoRF) does better than all other models for these mice, with the lowest MSE (p < 1.6e−08, F(3,12) = 91.07, LMM post hoc pairwise comparison with AgeG, t(12) = 13.60, FDR-adjusted p < 0.0001), lowest RMSE (p < 1.6e−08, F(3,12) = 93.88, LMM post hoc pairwise comparison with AgeG, t(12) = 14.15, FDR-adjusted p < 0.0001) and highest R² (p < 1.31e−08, F(3,12) = 94.32, LMM post hoc pairwise comparison with AgeG, t(12) = 14.10, FDR-adjusted p < 0.0001). C, We further investigate the difference between Age and vFI predictors in terms of feature importance. Features lying along the diagonal are important for both Age and vFI predictions.
D, Predicting FI score from video features extracted from videos of shorter durations. We used video features generated from videos with shorter durations (first 5 and 20 minutes) to investigate the loss in accuracy in predicting age and FI score. We used the random forest model trained with features generated from 60-minute videos as a baseline model for comparison. Accuracy decreased with shorter videos. The features associated with 60-minute videos had the best accuracy for vFI prediction (LMM where ‘simulation’ is the random effect, nsim = 50; lowest MAE, F(2,98) = 178.39, p < 2.2e−16; lowest RMSE, F(2,98) = 156.93, p < 2.2e−16; highest R², F(2,98) = 297.3, p < 2.2e−16). We observed a significant drop in accuracy when the open-field test length is reduced from 60 to 20 minutes (LMM with post hoc pairwise comparisons: MAE, t(98) = 14.82, FDR-adjusted p < 0.0001; RMSE, t(98) = 13.69, FDR-adjusted p < 0.0001; R², t(98) = −19.22, FDR-adjusted p < 0.0001). E, To see how much training data is realistically needed, we performed a simulation study (n = 50 independent train-test splits) in which we allocated different percentages of the total data to training. As expected, there is a general downward (upward) trend in MAE and RMSE (R²) with an increasing percentage of data allocated to the training set. Indeed, a smaller training set (< 80% training) can reach a similar performance. In A, B, D and E, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), respectively, the line in the middle corresponds to the median, and the upper (lower) whisker extends from the upper (lower) hinge to the largest (smallest) value no further than 1.5 × IQR from the hinge, where IQR is the interquartile range.
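
The simulation in panel E repeats random train-test splits at several training fractions and averages a test-set error for each fraction. The Python sketch below shows that loop in miniature. It swaps in a one-feature least-squares line for the random forest to stay self-contained, and it uses synthetic ages and scores; the function names, the data-generating slope and the noise level are all assumptions made for illustration. A real analysis with repeated measures would also need to keep each animal's records on one side of the split.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept (stand-in for the real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def split_mae(xs, ys, train_fraction, rng):
    """Test-set MAE for one random split at the given training fraction."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_train = int(train_fraction * len(idx))
    if n_train < 2 or n_train >= len(idx):
        raise ValueError("split must leave >= 2 training and >= 1 test points")
    train, test = idx[:n_train], idx[n_train:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in test]
    return sum(errors) / len(errors)


def learning_curve(xs, ys, fractions, n_splits=50, seed=0):
    """Mean test MAE over n_splits random splits for each training fraction."""
    rng = random.Random(seed)
    return {
        f: sum(split_mae(xs, ys, f, rng) for _ in range(n_splits)) / n_splits
        for f in fractions
    }


# Synthetic data: score rises with age (weeks) plus Gaussian noise.
data_rng = random.Random(1)
ages = [data_rng.uniform(8, 150) for _ in range(200)]
scores = [0.002 * a + data_rng.gauss(0, 0.03) for a in ages]

curve = learning_curve(ages, scores, fractions=[0.2, 0.5, 0.8])
for fraction, mae in curve.items():
    print(f"train fraction {fraction:.1f}: mean test MAE = {mae:.4f}")
```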

Supplementary information

Supplementary Information

Supplementary Tables 1–5

Reporting Summary

Supplementary Video 1

Young mouse (8 weeks old) sample open-field video.

Supplementary Video 2

Old mouse (134 weeks old) sample open-field video.

Supplementary Video 3

Flexibility metrics. At each frame, three points shown in yellow are estimated: base of head (A), mid-back (B) and base of tail (C). At each frame, the distance between points A and C (dAC, shown in red), distance between point B and the midpoint of line AC (dB, shown in green) and the angle formed by the points ABC (aABC, shown in blue) are calculated.
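
The three per-frame flexibility measures described above follow directly from the coordinates of the three keypoints. A minimal Python sketch is given below. It assumes 2D pixel coordinates and takes the angle aABC at the vertex B (mid-back), which is one reading of "the angle formed by the points ABC"; the function name and the example coordinates are hypothetical.

```python
import math


def flexibility_metrics(a, b, c):
    """Return (dAC, dB, aABC in degrees) for one video frame.

    a: base of head, b: mid-back, c: base of tail, each an (x, y) pair.
    """
    # dAC: straight-line distance from base of head to base of tail.
    d_ac = math.dist(a, c)
    # dB: distance from mid-back to the midpoint of segment AC.
    midpoint = ((a[0] + c[0]) / 2, (a[1] + c[1]) / 2)
    d_b = math.dist(b, midpoint)
    # aABC: angle at B between the vectors B->A and B->C.
    ba = (a[0] - b[0], a[1] - b[1])
    bc = (c[0] - b[0], c[1] - b[1])
    dot = ba[0] * bc[0] + ba[1] * bc[1]
    cross = ba[0] * bc[1] - ba[1] * bc[0]
    angle = math.degrees(math.atan2(abs(cross), dot))
    return d_ac, d_b, angle


# A bent spine: mid-back displaced one unit from the head-tail line.
print(flexibility_metrics((0, 0), (1, 1), (2, 0)))
# A straight spine: all three points collinear.
print(flexibility_metrics((0, 0), (1, 0), (2, 0)))
```

With this convention a perfectly straight spine gives dB = 0 and aABC = 180°, and both values move away from those limits as the animal bends.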

Supplementary Video 4

Rearing metrics. Rearing is called when the nose of the mouse (marked with a red dot) crosses the perimeter of the open field (yellow line). Rearing is indicated by the presence of the red square in the upper corner of the video.
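
The rearing rule above amounts to a per-frame test of whether the nose keypoint lies outside the arena floor boundary. The Python sketch below illustrates that test and counts rearing bouts as runs of consecutive rearing frames. It assumes the perimeter is an axis-aligned rectangle in pixel coordinates, which is a simplification made here; the function names, arena bounds and nose track are hypothetical.

```python
def is_rearing(nose, arena):
    """True when the nose lies outside the arena perimeter.

    nose: (x, y) position of the nose keypoint in pixels.
    arena: (x_min, y_min, x_max, y_max) of the open-field floor
           (assumed rectangular and axis-aligned for this sketch).
    """
    x, y = nose
    x_min, y_min, x_max, y_max = arena
    return not (x_min <= x <= x_max and y_min <= y <= y_max)


def rearing_bouts(nose_track, arena):
    """Return per-frame rearing flags and the number of rearing bouts.

    A bout starts at each frame flagged as rearing whose previous frame
    was not flagged (or that is the first frame of the track).
    """
    flags = [is_rearing(p, arena) for p in nose_track]
    bouts = sum(
        1 for i, flag in enumerate(flags) if flag and (i == 0 or not flags[i - 1])
    )
    return flags, bouts


arena = (0, 0, 100, 100)
track = [(50, 50), (99, 50), (103, 50), (104, 52), (98, 55), (50, -2), (50, 10)]

flags, bouts = rearing_bouts(track, arena)
print(flags, bouts)
```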

Supplementary Table 1

Details the manual FI scoring.

Supplementary Software 1

  • ReadMe.txt: Contains this description.
  • vFI_expts.R: Contains functions that define the steps for carrying out synthetic experiments I–V. The steps include performing train–test splits, fitting machine-learning models, using the fitted models to make predictions and reporting the performance measures on the test sets.
  • vFI_Figures.R: Contains the script for recreating the figures/plots in the paper. This file requires the synthetic experiments’ results, stored as CSV files.
  • vFI_functions.R: Contains all the main functions for carrying out the analyses in the paper.
  • vFI_utils.R: Contains utility functions, such as creating storage matrices for results from different experiments and saving them.
  • vFI.R: The main file, which uses the previous files to carry out the analyses in the paper. It creates matrices containing the results and saves them as CSVs.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hession, L.E., Sabnis, G.S., Churchill, G.A. et al. A machine-vision-based frailty index for mice. Nat Aging 2, 756–766 (2022). https://doi.org/10.1038/s43587-022-00266-0

  • DOI: https://doi.org/10.1038/s43587-022-00266-0
