Precision psychiatry demands the rapid, efficient, and temporally dense collection of large-scale, multi-omic data across diverse samples to improve the diagnosis and treatment of dynamic clinical phenomena. To achieve this, we need approaches for measuring behavior that are readily scalable, both across participants and over time. Efforts to quantify behavior at scale are impeded by the fact that our methods for measuring human behavior are typically developed and validated for single time-point assessment, in highly controlled settings, and with relatively homogeneous samples. As a result, when taken to scale, these measures often suffer from poor reliability, generalizability, and participant engagement. In this review, we attempt to bridge the gap between gold-standard behavioral measurements in the lab or clinic and the large-scale, high-frequency assessments needed for precision psychiatry. To do this, we introduce and integrate two frameworks for the translation and validation of behavioral measurements. First, borrowing principles from computer science, we lay out an approach for iterative task development that can optimize behavioral measures based on psychometric, accessibility, and engagement criteria. Second, we advocate for a participatory research framework (e.g., citizen science) that can accelerate task development as well as make large-scale behavioral research more equitable and feasible. Finally, we suggest opportunities enabled by scalable behavioral research to move beyond single time-point assessment and toward dynamic models of behavior that more closely match clinical phenomena.
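The iterative task-development loop described above can be sketched in code. The following is a minimal toy illustration, not the authors' procedure: each candidate task variant is scored on a weighted combination of split-half reliability (a psychometric criterion) and completion rate (an engagement criterion), and the best-scoring variant seeds the next iteration. The function names, weights, and data are all hypothetical.

```python
import statistics

def split_half_reliability(scores_a, scores_b):
    """Pearson correlation between two task halves, Spearman-Brown
    corrected to estimate full-test reliability."""
    n = len(scores_a)
    mean_a, mean_b = statistics.fmean(scores_a), statistics.fmean(scores_b)
    cov = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(scores_a, scores_b)) / (n - 1)
    r = cov / (statistics.stdev(scores_a) * statistics.stdev(scores_b))
    return 2 * r / (1 + r)  # Spearman-Brown prophecy formula

def score_variant(half1, half2, n_completed, n_started,
                  w_reliability=0.7, w_engagement=0.3):
    """Weighted objective combining reliability and engagement.
    The weights are arbitrary choices for illustration."""
    reliability = split_half_reliability(half1, half2)
    engagement = n_completed / n_started
    return w_reliability * reliability + w_engagement * engagement

# Toy comparison of two hypothetical task variants:
variant_a = score_variant([10, 12, 15, 20], [11, 13, 14, 21],
                          n_completed=80, n_started=100)
variant_b = score_variant([10, 12, 15, 20], [20, 11, 14, 12],
                          n_completed=95, n_started=100)
best = max(("A", variant_a), ("B", variant_b), key=lambda kv: kv[1])
```

In a real deployment, the scoring would also incorporate accessibility criteria (e.g., performance parity across devices or assistive technologies), and variant comparisons would be run as randomized A/B tests rather than on fixed samples.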
Cite this article
Germine, L., Strong, R.W., Singh, S. et al. Toward dynamic phenotypes and the scalable measurement of human behavior. Neuropsychopharmacol. 46, 209–216 (2021). https://doi.org/10.1038/s41386-020-0757-1