A machine-vision-based frailty index for mice

Hession, Leinani E.; Sabnis, Gautam S.; Churchill, Gary A.; Kumar, Vivek

doi:10.1038/s43587-022-00266-0

Technical Report
Published: 16 August 2022

Nature Aging volume 2, pages 756–766 (2022)

Abstract

Heterogeneity in biological aging manifests itself in health status and mortality. Frailty indices (FIs) capture health status in humans and model organisms. To accelerate our understanding of biological aging and carry out scalable interventional studies, high-throughput approaches are necessary. Here we introduce a machine-learning-based visual FI for mice that operates on video data from an open-field assay. We use machine vision to extract morphometric, gait and other behavioral features that correlate with FI score and age. We use these features to train a regression model that accurately predicts the normalized FI score within 0.04 ± 0.002 (mean absolute error). Unnormalized, this error is 1.08 ± 0.05, which is comparable to one FI item being mis-scored by 1 point or two FI items mis-scored by 0.5 points. This visual FI provides increased reproducibility and scalability that will enable large-scale mechanistic and interventional studies of aging in mice.
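The relation between the two reported errors can be checked with a short calculation. The sketch below assumes that the normalized FI score is the summed item score divided by the number of items. The item count of 27 is not stated in the abstract; it is inferred here from the ratio 1.08 / 0.04 and should be treated as an assumption. The scores are made-up values for illustration.

```python
# Sketch: relate normalized and unnormalized frailty-index (FI) errors.
# ASSUMPTION: normalized FI = (sum of item scores) / N_ITEMS, with each item
# scored 0, 0.5 or 1. N_ITEMS = 27 is inferred from 1.08 / 0.04 and is not
# stated in the abstract.
N_ITEMS = 27


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between paired scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def unnormalize(score, n_items=N_ITEMS):
    """Convert a normalized FI quantity back to the summed-item scale."""
    return score * n_items


# Hypothetical manual and predicted normalized FI scores for three mice.
manual = [0.10, 0.25, 0.40]
predicted = [0.14, 0.21, 0.44]

mae_norm = mean_absolute_error(manual, predicted)
mae_raw = unnormalize(mae_norm)
print(round(mae_norm, 3), round(mae_raw, 2))
```

Under this assumption, an unnormalized error of about 1 corresponds to a single item mis-scored by a full point, or two items each mis-scored by 0.5.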


**Fig. 1: Approach overview to build a visual frailty index.**

**Fig. 2: Sample features used in the vFI.**

**Fig. 3: Comparison of male and female measures.**

**Fig. 4: Prediction of age and frailty from video features.**

**Fig. 5: Quantile regression modeling of vFI using generalized random forests.**


Data availability

Files containing the manual FI scores and vFI features for all mice in our dataset have been submitted as the source data for figures. Both files can also be found on the GitHub repository https://github.com/KumarLabJax/vFI-modeling and on the Zenodo repository https://zenodo.org/badge/latestdoi/412051716.

Code availability

Data collection was carried out using custom code for video collection and processing, written in Python 3. Details and code can be found on the GitHub page https://github.com/KumarLabJax/JABS-data-pipeline. Code and models will be available in the Kumar Lab GitHub account (https://github.com/KumarLabJax and https://www.kumarlab.org/data/). The markdown file in the GitHub repository https://github.com/KumarLabJax/vFI-modeling contains requirements and details for reproducing the results in the article and for training your own models for vFI/Age prediction. This modeling code is written in R, has also been released at https://zenodo.org/badge/latestdoi/412051716 and is available as supplementary software in ‘SupplementarySoftware 1.zip’. Code in Python 3 for generating engineered features can be found at https://github.com/KumarLabJax/vFI-features and has also been released at https://zenodo.org/badge/latestdoi/410956452.

References

Mitnitski, A., Mogilner, A. & Rockwood, K. Accumulation of deficits as a proxy measure of aging. Sci. World J. 1, 323–336 (2001).
Whitehead, J. C. et al. A clinical frailty index in aging mice: comparisons with frailty index data in humans. J. Gerontol. A Biomed. Sci. Med. Sci. 69, 621–632 (2014).
Rockwood, K., Fox, R. A., Stolee, P., Robertson, D. & Beattie, B. L. Frailty in elderly people: an evolving concept. CMAJ 150, 489–495 (1994).
Searle, S. D., Mitnitski, A., Gahbauer, E. A., Gill, T. M. & Rockwood, K. A standard procedure for creating a frailty index. BMC Geriatrics 8, 24 (2008).
Schultz, M. B. et al. Age and life expectancy clocks based on machine-learning analysis of mouse frailty. Nat. Commun. 11, 1–12 (2020).
Kim, S., Myers, L., Wyckoff, J., Cherry, K. E. & Jazwinski, S. M. The frailty index outperforms DNA methylation age and its derivatives as an indicator of biological age. GeroSci. 39, 83–92 (2017).
Kojima, G., Iliffe, S. & Walters, K. Frailty index as a predictor of mortality: a systematic review and meta-analysis. Age Ageing 47, 193–200 (2017).
Parks, R. et al. A procedure for creating a frailty index based on deficit accumulation in aging mice. J. Gerontol. A Biol. Sci. Med. Sci. 67, 217–227 (2012).
Rockwood, K. et al. A frailty index based on deficit accumulation quantifies mortality risk in humans and in mice. Sci. Rep. 7, 43068 (2017).
Kane, A. E., Ayaz, O., Ghimire, A., Feridooni, H. A. & Howlett, S. E. Implementation of the mouse frailty index. Canadian J Physiol. Pharmacol. 95, 1149–1155 (2017).
Feridooni, H. A., Sun, M. H., Rockwood, K. & Howlett, S. E. Reliability of a frailty index based on the clinical assessment of health deficits in male C57BL/6J mice. J. Gerontol. A 70, 686–693 (2014).
Kane, A. E. et al. Factors that impact on interrater reliability of the mouse clinical frailty index. J. Gerontol. A 70, 694–695 (2015).
Walsh, R. N. & Cummins, R. A. The Open field test: a critical review. Psychol. Bull. 83, 482–504 (1976).
Crawley, J. N. What's Wrong With My Mouse: Behavioral Phenotyping of Transgenic and Knock-out Mice (Wiley, 2007).
Ziegler, L., Sturman, O. & Bohacek, J. Big behavior: challenges and opportunities in a new era of deep behavior profiling. Neuropsychopharmacology 46, 33–44 (2020).
Kumar, V. et al. Second-generation high-throughput forward genetic screen in mice to isolate subtle behavioral mutants. Proc. Natl Acad. Sci. USA 108, 15557–15564 (2011).
LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Comm. ACM 60, 84–90 (2017).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. in Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
Schmidhuber, J. Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015).
Raghu, M. & Schmidt, E. A survey of deep learning for scientific discovery. Preprint at arXiv https://arxiv.org/abs/2003.11755 (2020).
Geuther, B. et al. Robust mouse tracking in complex environments using neural networks. Commun. Biol. 2, 124 (2019).
Geuther, B. Q. et al. Action detection using a neural network elucidates the genetics of mouse grooming behavior. eLife 10, e63207 (2021).
Sheppard, K. et al. Stride-level analysis of mouse open field behavior using deep-learning-based pose estimation. Cell Rep. 38, 110231 (2022).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Wiltschko, A. B. et al. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat. Neurosci. 23, 1433–1443 (2020).
Hsu, A. I. & Yttri, E. A. B-SOiD: An open source unsupervised algorithm for discovery of spontaneous behaviors. Nat. Commun. 12, 5188 (2021).
Baumann, C., Kwak, D. & Thompson, L. Sex-specific components of frailty in C57BL/6 mice. Aging 11, 5206–5214 (2019).
Sampathkumar, N. K. et al. Widespread sex dimorphism in aging and age-related diseases. Hum. Genet. 139, 333–356 (2020).
Austad, S. N. in Handbook of the Biology of Aging 479–495 (Elsevier, 2011).
Austad, S. N. & Fischer, K. E. Sex differences in lifespan. Cell Metab. 23, 1022–1033 (2016).
Sukoff Rizzo, S. J. et al. Assessing healthspan and lifespan measures in aging mice: optimization of testing protocols, replicability, and rater reliability. Curr. Protoc. Mouse Biol. 8, e45 (2018).
Hartigan, J. A. & Hartigan, P. M. The dip test of unimodality. Ann. Stat. 13, 70–84 (1985).
Simpson, E. H. The interpretation of interaction in contingency tables. J. R. Stat. Soc. B Methodol. 13, 238–241 (1951).
Pappas, L. & Nagy, T. The translation of age-related body composition findings from rodents to humans. Eur. J. Clin. Nutr. 73, 172–178 (2018).
Zhou, Y. et al. The detection of age groups by dynamic gait outcomes using machine-learning approaches. Sci. Rep. 10, 4426 (2020).
Skiadopoulos, A., Moore, E. E., Sayles, H. R., Schmid, K. K. & Stergiou, N. Step width variability as a discriminator of age-related gait changes. J. Neuroeng. Rehab. 17, 41 (2020).
Tarantini, S. et al. Age-related alterations in gait function in freely moving male C57BL/6 mice: translational relevance of decreased cadence and increased gait variability. J. Gerontol. A 74, 1417–1421 (2018).
Bair, W.-N. et al. Of aging mice and men: gait speed decline is a translatable trait, with species-specific underlying properties. J. Gerontol. A 74, 1413–1416 (2019).
Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B Stat. Methodol. 67, 301–320 (2005).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Friedman, J. H. Greedy function approximation: a gradient-boosting machine. Ann. Stat. 29, 1189–1232 (2001).
Zhang, H., Zimmerman, J., Nettleton, D. & Nordman, D. J. Random forest prediction intervals. Am. Stat. 74, 1–15 (2019).
Doshi-Velez, F. & Kim, B. Towards a rigorous science of interpretable machine learning. Preprint at arXiv https://doi.org/10.48550/arXiv.1702.08608 (2017).
Molnar, C. Interpretable Machine Learning (Lulu.com, 2020).
Friedman, J. H. et al. Predictive learning via rule ensembles. Ann. Appl. Stat. 2, 916–954 (2008).
Apley, D. W. & Zhu, J. Visualizing the effects of predictor variables in black box supervised learning models. J. R. Stat. Soc. B Stat. Methodol. 82, 1059–1086 (2020).
Mizrahi-Lehrer, E., Cepeda-Valery, B. & Romero-Corral, A. in Handbook of Anthropometry: Physical Measures of Human Form in Health and Disease (ed Preedy, V. R.) 385–395 (Springer, 2012).
Pappas, L. E. & Nagy, T. R. The translation of age-related body composition findings from rodents to humans. Eur. J. Clin. Nutr. 73, 172–178 (2019).
Huffman, D. M. & Barzilai, N. Role of visceral adipose tissue in aging. Biochim. Biophys. Acta 1790, 1117–1123 (2009).
Gerbaix, M., Metz, L., Ringot, E. & Courteix, D. Visceral fat mass determination in rodent: Validation of dual-energy X-ray absorptiometry and anthropometric techniques in fat and lean rats. Lipids Health Dis. 9, 140 (2010).
Imagama, S. et al. Back muscle strength and spinal mobility are predictors of quality of life in middle-aged and elderly males. Eur. Spine J. 20, 954–961 (2011).
Kane, A., Keller, K. M., Heinze-Milne, S. D., Grandy, S. & Howlett, S. A murine frailty index based on clinical and laboratory measurements: links between frailty and pro-inflammatory cytokines differ in a sex-specific manner. J. Gerontol. A 74, 275–282 (2019).
Beane, G. et al. Video based phenotyping platform for the laboratory mouse. Preprint at bioRxiv https://doi.org/10.1101/2022.01.13.476229 (2022).
Pereira, T. D., Shaevitz, J. W. & Murthy, M. Quantifying behavior to understand the brain. Nat. Neurosci. 23, 1537–1549 (2020).
Mathis, A., Schneider, S., Lauer, J. & Mathis, M. W. A primer on motion capture with deep learning: principles, pitfalls, and perspectives. Neuron 108, 44–65 (2020).
Singh, P. P., Demmitt, B. A., Nath, R. D. & Brunet, A. The genetics of aging: a vertebrate perspective. Cell 177, 200–220 (2019).
Crainiceanu, C. M. & Ruppert, D. Likelihood ratio tests in linear mixed models with one variance component. J. R. Stat. Soc. B Stat. Methodol. 66, 165–185 (2004).
Agresti, A. Categorical Data Analysis (John Wiley & Sons, 2003).
Bates, D., Maechler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
Kenward, M. G. & Roger, J. H. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53, 983–997 (1997).
Fai, A. H.-T. & Cornelius, P. L. Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. J. Stat. Comput. Simul. 54, 363–378 (1996).
McCullagh, P. Regression models for ordinal data. J. R. Stat. Soc. B Methodol. 42, 109–127 (1980).
Meinshausen, N. Quantile regression forests. J. Mach. Learn. Res. 7, 983–999 (2006).
Athey, S. et al. Generalized random forests. Ann. Stat. 47, 1148–1178 (2019).


Acknowledgements

We thank Kumar Laboratory members S. Deats, T. Sproule, B. Geuther and K. Sheppard for behavioral testing, data processing and helpful advice. We thank Shock Center and Churchill Laboratory members H. Donato, G. Garland, M. Leland and L. Robinson for frailty indexing and coordinating. We thank T. Helenius for editing. We thank the members of the JAX Information Technology team for infrastructure support. This work was funded by The Jackson Laboratory Director's Innovation Fund, National Institutes of Health, DA041668 (V.K.) and DA048634 (V.K.) and Nathan Shock Centers of Excellence in the Basic Biology of Aging, AG38070 (G.A.C.).

Author information

These authors contributed equally: Leinani E. Hession, Gautam S. Sabnis.

Authors and Affiliations

The Jackson Laboratory, Bar Harbor, ME, USA
Leinani E. Hession, Gautam S. Sabnis, Gary A. Churchill & Vivek Kumar

Contributions

G.A.C. contributed the mice and data collection. L.E.H. and G.S.S. both contributed to the analysis of the dataset. L.E.H., G.S.S. and V.K. contributed to the writing of the article. All authors discussed results and contributed to the direction of the article.

Corresponding authors

Correspondence to Gary A. Churchill or Vivek Kumar.

Ethics declarations

Competing interests

The Jackson Laboratory has filed a provisional patent on the methods described in this article.

Peer review

Peer review information

Nature Aging thanks Eric Yttri, Johannes Bohacek and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Dr. Marie Anne O'Donnell, Dr. Manlio Vinciguerra and Dr. Sebastien Thuault, in collaboration with the Nature Aging team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Estimation of the scorer effect in clinical FI items.

A, The effect of tester varies across FI items. B, The estimated random effect across 4 scorers in the data set.

Source data

Extended Data Fig. 2 Detailed modeling analysis.

A, The distribution of age across 643 data points (533 mice). The distribution of manual FI_adj scores across 643 data points (533 mice). B, To determine the contributions of frailty parameters in predicting Age, we calculated the feature importance of all frailty parameters. We found that gait disorders, kyphosis and piloerection have the highest contributions. C, The random forest regression model performed better than the other models, with the lowest root-mean-squared error (RMSE) (n = 50 independent train-test splits, p < 2.2e−16, F_3,147 = 59.53) and highest R2 (p < 2.2e−16, F_3,147 = 58.14) when compared using repeated-measures ANOVA. D, The vFRIGHT model performed better than the FRIGHT model, with a lower RMSE (n = 50 independent train-test splits, RMSE_vFRIGHT = 17.97 ± 1.44, RMSE_FRIGHT = 20.62 ± 4.78, p < 6.1e−7, F_1,49 = 32.84) and higher R2 (R2_vFRIGHT = 0.78 ± 0.04, R2_FRIGHT = 0.76 ± 0.07, p < 2.1e−8, F_1,49 = 44.54) when compared using repeated-measures ANOVA. E, The random forest regression model for predicting FI score on unseen future data performed better than all other models, with the lowest RMSE (n = 50 independent train-test splits, p < 8.3e−14, F_3,147 = 26.62) and highest R2 (p < 4.7e−14, F_3,147 = 27.2). F, The distribution of counts (0, green; 0.5, orange; 1, purple) for individual frailty parameters. For many parameters, such as Nasal discharge, Rectal prolapse, Vaginal uterine and Diarrhea, the proportion of 0 counts is 1 (p0 = 1). Similarly, Dermatitis, Cataracts, Eye discharge swelling, Microphthalmia, Corneal opacity, Tail stiffening and Malocclusions have p0 > 0.95.
G, The residuals versus the index and predicted versus true values for the training set (Column 1; residual standard error = 8.5, difference in slopes (black vs gray) = 0.11) and the test set (Column 2; residual standard error = 15.87, difference in slopes (black vs gray) = 0.30) for the model that predicts Age using frailty index items. H, I, Out-of-bag (OOB) error-based 95% prediction intervals (PIs) (gray lines) quantifying uncertainty in point estimates/predictions (gray dots). There is one interval per test mouse (n = 107 unique mice; the test data contain some repeats of the same mice tested at different ages), and approximately 95% of the PIs contain the correct Age (red dots) and FI scores (blue dots). We ordered the x-axis (Test set index) in ascending order (from left to right) of the actual age/FI. The average PI width across all test mice is 0.18 ± 0.04 for the predicted FI score (71.96 ± 18.52 for the predicted Age), while the PI widths range from 0.08 to 0.29 (28 to 113 for Age). In C, D and E, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), respectively, the line in the middle corresponds to the median, and the upper (lower) whisker extends from the upper (lower) hinge to the largest (smallest) value no further than 1.5 × IQR from the hinge, where IQR is the interquartile range.
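The interval summaries quoted for panels H and I (empirical coverage, mean width and width range) can be computed from any set of prediction intervals. The following is a minimal sketch of that bookkeeping only; it does not reproduce how the intervals themselves were estimated, and the function name and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def interval_summary(lower, upper, truth):
    """Summarize prediction intervals: coverage and width statistics."""
    widths = [u - l for l, u in zip(lower, upper)]
    covered = [l <= t <= u for l, u, t in zip(lower, upper, truth)]
    return {
        "coverage": sum(covered) / len(covered),  # fraction of intervals containing the truth
        "mean_width": mean(widths),
        "sd_width": stdev(widths),
        "min_width": min(widths),
        "max_width": max(widths),
    }


# Hypothetical interval bounds and true FI scores for four test animals.
lower = [0.10, 0.15, 0.20, 0.30]
upper = [0.25, 0.35, 0.40, 0.45]
truth = [0.20, 0.30, 0.45, 0.35]

summary = interval_summary(lower, upper, truth)
print(summary)
```

A well-calibrated 95% interval should give a coverage close to 0.95 on held-out animals; the toy data here cover only three of four values.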

Source data

Extended Data Fig. 3 Correlation between video metrics.

A, Correlation between average/mean (x-axis) and median (y-axis) video gait metrics. The diagonal line corresponds to maximum correlation i.e. 1. B, Correlation between inter-quartile range (IQR, x-axis) and standard deviation (Stdev, y-axis) video gait metrics. The diagonal line corresponds to maximum correlation i.e. 1. A tight wrap of points around the diagonal line indicates a high correlation between mean and median or IQR and standard deviation for the respective metric.
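As an illustration of the comparison in panel A, the sketch below summarizes each animal's per-stride values by their mean and median and then computes the Pearson correlation between the two summaries across animals. The animal names and stride values are hypothetical, and the caption does not state which correlation coefficient was used, so Pearson is an assumption.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-stride measurements for three animals.
strides = {
    "mouse_1": [1.0, 2.0, 3.0],
    "mouse_2": [2.0, 4.0, 6.0],
    "mouse_3": [3.0, 6.0, 9.0],
}

means = [mean(v) for v in strides.values()]
medians = [median(v) for v in strides.values()]
r = pearson(means, medians)
print(r)
```

When the per-animal distributions are symmetric, as in this toy data, the mean and median coincide and the points fall on the diagonal.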

Source data

Extended Data Fig. 4 Test for Simpson’s paradox.

A, Simpson (1951) showed that the statistical relationship observed in the population could be reversed within all of the subgroups that make up that population, leading to erroneous conclusions drawn from the population data. To test for the manifestation of Simpson’s paradox in our data, we split the bimodal Age distribution into two separate unimodal distributions (clusters), that is, less than 70 weeks old (L70, red) versus more than 70 weeks old (U70, blue). Next, we plotted the dependent variable (frailty) against each of the independent variables/features in our data and fit a simple linear regression model to each subgroup separately (solid red and blue lines) as well as to the aggregate data (black dotted line). B, We quantified the correlations by measuring the slope of the linear fits of the features (Y) on Age (X). We computed the slopes for L70, U70 and overall (All), then plotted the slopes for features in decreasing order of their relevance to the model (where we predict Age from these features). We then performed a one-way ANOVA to test for differences in slopes between the L70 and U70 subgroups and the overall data (one-way ANOVA, F_2,141 = 1.162, p > 0.32). Next, we performed false discovery rate-adjusted post hoc pairwise comparisons using the t-test. We found no significant differences in the comparisons (L70 versus U70, p = 0.38; L70 versus All, p = 0.77; and U70 versus All, p = 0.38). We found that Simpson’s paradox does not manifest in any of the top fifteen features in our data.
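The slope comparison described in panel B can be sketched as follows: fit a least-squares line of a feature on Age within each age cluster and on the pooled data, then check whether the pooled slope has the opposite sign to both subgroup slopes. This is a simplified illustration with made-up data. The function names are hypothetical, animals aged exactly 70 weeks are assigned to U70 as an arbitrary choice, and the ANOVA and post hoc tests from the caption are not reproduced.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slopes_by_group(age, feature, cutoff=70):
    """Slopes of feature on age for L70, U70 and all animals pooled."""
    young = [(a, f) for a, f in zip(age, feature) if a < cutoff]
    old = [(a, f) for a, f in zip(age, feature) if a >= cutoff]
    return {
        "L70": ols_slope(*zip(*young)),
        "U70": ols_slope(*zip(*old)),
        "All": ols_slope(age, feature),
    }


def shows_reversal(slopes):
    """True if the pooled slope has the opposite sign to both subgroup slopes."""
    return slopes["L70"] * slopes["All"] < 0 and slopes["U70"] * slopes["All"] < 0


age = [20, 30, 40, 90, 100, 110]  # weeks; hypothetical animals

# Feature rising with age within and across clusters: no paradox.
consistent = slopes_by_group(age, [1, 2, 3, 7, 8, 9])
# Feature rising within each cluster but lower overall in old animals: paradox.
reversed_ = slopes_by_group(age, [5, 6, 7, 1, 2, 3])

print(shows_reversal(consistent), shows_reversal(reversed_))
```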

Source data

Extended Data Fig. 5 Further experiments to test model performance and parameters.

A, We compare the performance of different feature sets, (1) age alone, (2) video and (3) age + video, in predicting frailty across n = 50 independent train-test splits. We use age alone as a feature in a linear model (Age_L) and a generalized additive non-linear model (Age_G). Although we did not notice a clear improvement of the random forest model using video features (Video_RF) over a vFI prediction based on age alone, a clear improvement in prediction performance is seen for the model containing video features + age (All_RF), with the lowest MSE (p < 2.2e−16, F_3,147 = 213.79, LMM post hoc pairwise comparison with Age_G, t₁₄₇ = −12.21, FDR-adjusted p < 0.0001), lowest RMSE (p < 2.2e−16, F_3,147 = 172.88, LMM post hoc pairwise comparison with Age_G, t₁₄₇ = −14.12, FDR-adjusted p < 0.0001) and highest R2 (p < 2.2e−16, F_3,147 = 171.12, LMM post hoc pairwise comparison with Age_G, t₁₄₇ = 14.07, FDR-adjusted p < 0.0001). This shows that video features add important information pertaining to frailty that age alone does not. B, We picked animals whose ages and FI scores had an inverse relationship, that is, younger animals with higher FI scores and older animals with lower FI scores. We formed 5 test sets (n = 43, 38, 45, 42, 20) containing animals meeting these criteria and trained the random forest (RF) model on the remaining mice. The model using only video features (Video_RF) does better than all other models for these mice, with the lowest MSE (p < 1.6e−08, F_3,12 = 91.07, LMM post hoc pairwise comparison with Age_G, t₁₂ = 13.60, FDR-adjusted p < 0.0001), lowest RMSE (p < 1.6e−08, F_3,12 = 93.88, LMM post hoc pairwise comparison with Age_G, t₁₂ = 14.15, FDR-adjusted p < 0.0001) and highest R2 (p < 1.31e−08, F_3,12 = 94.32, LMM post hoc pairwise comparison with Age_G, t₁₂ = 14.10, FDR-adjusted p < 0.0001). C, We further investigate the difference between Age and vFI predictors in terms of feature importance.
Features lying along the diagonal are important for both Age and vFI predictions. D, Predicting FI score from video features extracted from videos of shorter durations. We used video features generated from videos with shorter durations (first 5 and 20 minutes) to investigate the loss in accuracy in predicting age and FI score. We used the random forest model trained with features generated from 60-minute videos as a baseline model for comparison. We found diminished accuracy using shorter videos. The features associated with 60-minute videos had the best accuracy for vFI prediction (LMM where ‘simulation’ is the random effect, nsim = 50; lowest MAE, F_2,98 = 178.39, p < 2.2e−16; lowest RMSE, F_2,98 = 156.93, p < 2.2e−16; highest R2, F_2,98 = 297.3, p < 2.2e−16). We observed a significant drop in accuracy when the open-field test length is reduced from 60 to 20 minutes (LMM with post hoc pairwise comparisons: MAE, t₉₈ = 14.82, FDR-adjusted p < 0.0001; RMSE, t₉₈ = 13.69, FDR-adjusted p < 0.0001; R2, t₉₈ = −19.22, FDR-adjusted p < 0.0001). E, To see how much training data is realistically needed, we performed a simulation study (n = 50 independent train-test splits) in which we allocated different percentages of the total data to training. As expected, there is a general downward (upward) trend in MAE and RMSE (R2) with an increasing percentage of data allocated to the training set. Indeed, a smaller training set (< 80% training) can reach a similar training performance. In A, B, D and E, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), respectively, the line in the middle corresponds to the median, and the upper (lower) whisker extends from the upper (lower) hinge to the largest (smallest) value no further than 1.5 × IQR from the hinge, where IQR is the interquartile range.
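The evaluation scheme used throughout this figure (repeated random train-test splits scored by MAE, RMSE and R2) can be sketched in a few lines. In the sketch below a one-feature least-squares line stands in for the random forest purely to keep the example self-contained; the function names, the synthetic data and the 80% training fraction are assumptions for illustration, not the authors' code.

```python
import math
import random


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def fit_line(x, y):
    """Least-squares slope and intercept (stand-in for the real model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def evaluate_splits(x, y, train_frac=0.8, n_splits=50, seed=0):
    """Return (MAE, RMSE, R2) on the held-out part of each random split."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    results = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        cut = int(round(train_frac * len(idx)))
        train, test = idx[:cut], idx[cut:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [slope * x[i] + intercept for i in test]
        truth = [y[i] for i in test]
        results.append((mae(truth, pred), rmse(truth, pred), r2(truth, pred)))
    return results


# Synthetic data: one feature with additive noise (hypothetical values).
noise = random.Random(1)
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + noise.gauss(0.0, 0.5) for xi in x]

results = evaluate_splits(x, y)
print(sum(r[0] for r in results) / len(results))  # mean MAE across splits
```

Repeating the call with different values of train_frac gives the kind of training-set-size comparison described for panel E.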

Source data

Supplementary information

Supplementary Information

Supplementary Tables 1–5

Reporting Summary

Supplementary Video 1

Young mouse (8 weeks old) sample open-field video.

Supplementary Video 2

Old mouse (134 weeks old) sample open-field video.

Supplementary Video 3

Flexibility metrics. At each frame, three points shown in yellow are estimated: base of head (A), mid-back (B) and base of tail (C). At each frame, the distance between points A and C (dAC, shown in red), distance between point B and the midpoint of line AC (dB, shown in green) and the angle formed by the points ABC (aABC, shown in blue) are calculated.
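The three per-frame flexibility metrics follow directly from the coordinates of points A, B and C. The sketch below is one way to compute them; it assumes 2D pixel coordinates and takes the angle aABC to be the angle at vertex B between the segments BA and BC, which the description implies but does not state explicitly. The function name is hypothetical.

```python
import math


def flexibility_metrics(a, b, c):
    """Return (dAC, dB, aABC in degrees) for points given as (x, y) tuples.

    a: base of head, b: mid-back, c: base of tail.
    """
    d_ac = math.dist(a, c)

    # Distance from B to the midpoint of segment AC.
    midpoint = ((a[0] + c[0]) / 2, (a[1] + c[1]) / 2)
    d_b = math.dist(b, midpoint)

    # Angle at B between vectors BA and BC (assumed vertex).
    ba = (a[0] - b[0], a[1] - b[1])
    bc = (c[0] - b[0], c[1] - b[1])
    cos_angle = (ba[0] * bc[0] + ba[1] * bc[1]) / (math.hypot(*ba) * math.hypot(*bc))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    a_abc = math.degrees(math.acos(cos_angle))

    return d_ac, d_b, a_abc


# A bent spine: B lies off the line AC.
print(flexibility_metrics((0, 0), (1, 1), (2, 0)))
# A straight spine: B lies on the line AC, so dB is 0 and the angle is 180 degrees.
print(flexibility_metrics((0, 0), (1, 0), (2, 0)))
```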

Supplementary Video 4

Rearing metrics. Rearing is called when the nose of the mouse (marked with a red dot) crosses the perimeter of the open field (yellow line). Rearing is indicated by the presence of the red square in the upper corner of the video.
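A minimal sketch of this rule is given below. It assumes the perimeter can be approximated by an axis-aligned rectangle in image coordinates and that the nose position is available for every frame; both are simplifications for illustration, and the function names are hypothetical.

```python
def nose_outside(nose, perimeter):
    """True if the nose point lies outside the rectangular perimeter.

    perimeter is (x_min, y_min, x_max, y_max); a rectangle is an assumption.
    """
    x, y = nose
    x_min, y_min, x_max, y_max = perimeter
    return x < x_min or x > x_max or y < y_min or y > y_max


def rearing_bouts(nose_track, perimeter):
    """Return (start_frame, end_frame) pairs, end exclusive, for each rearing bout."""
    bouts, start = [], None
    for frame, nose in enumerate(nose_track):
        outside = nose_outside(nose, perimeter)
        if outside and start is None:
            start = frame  # bout begins
        elif not outside and start is not None:
            bouts.append((start, frame))  # bout ends
            start = None
    if start is not None:  # track ended while the nose was still outside
        bouts.append((start, len(nose_track)))
    return bouts


# Hypothetical nose positions over five frames in a 10 x 10 arena.
track = [(5, 5), (11, 5), (12, 5), (5, 5), (5, -1)]
bouts = rearing_bouts(track, (0, 0, 10, 10))
print(bouts)
```

The number of bouts and their total duration in frames can then be derived from the returned list.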

Supplementary Table 1

Details the manual FI scoring.

Supplementary Software 1

ReadMe.txt: Contains this description.
vFI_expts.R: Contains functions that define the steps for carrying out synthetic experiments I–V. The steps include performing train–test splits, fitting machine-learning models, using the fitted models to make predictions and reporting the performance measures on the test sets.
vFI_Figures.R: Contains the script for recreating figures/plots in the paper. This file requires the synthetic experiments’ results, stored as CSV files.
vFI_functions.R: Contains all the main functions for carrying out all analyses in the paper.
vFI_utils.R: Contains some utility functions, such as creating storage matrices for results from different experiments and saving them.
vFI.R: The main file, which uses the previous files to carry out the analyses in the paper. It creates matrices containing the results and saves them as CSVs.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Hession, L.E., Sabnis, G.S., Churchill, G.A. et al. A machine-vision-based frailty index for mice. Nat Aging 2, 756–766 (2022). https://doi.org/10.1038/s43587-022-00266-0


Received: 07 April 2021
Accepted: 05 July 2022
Published: 16 August 2022
Issue Date: August 2022
DOI: https://doi.org/10.1038/s43587-022-00266-0

This article is cited by

How is Big Data reshaping preclinical aging research?
- Maria Emilia Fernandez
- Jorge Martinez-Romero
- Rafael de Cabo
Lab Animal (2023)
Machine learning to spot frailty in aging mice
- Elise S. Bisset
- Susan E. Howlett
Nature Aging (2022)
