Identity domains capture individual differences from across the behavioral repertoire


Personality traits can offer considerable insight into the biological basis of individual differences. However, existing approaches toward understanding personality across species rely on subjective criteria and limited sets of behavioral readouts, which result in noisy and often inconsistent outcomes. Here we introduce a mathematical framework for describing individual differences along dimensions with maximum consistency and discriminative power. We validate this framework in mice, using data from a system for high-throughput longitudinal monitoring of group-housed male mice that yields a variety of readouts from across the behavioral repertoire of individual animals. We demonstrate a set of stable traits that capture variability in behavior and gene expression in the brain, allowing for better-informed mechanistic investigations into the biology of individual differences.

Fig. 1: From behavior to personality.
Fig. 2: Testing the IDs.
Fig. 3: IDs are reflected in multiple standard behavioral tests.
Fig. 4: IDs carry information on gene expression in the brain.
Fig. 5: Personality space.

Data availability

The RNA-seq data for this project have been deposited to the NCBI’s Sequence Read Archive (SRA) under the following accession number: PRJNA542512. The datasets generated during and/or analyzed during the current study are available from the corresponding author upon request.

Code availability

All the code used in the Matlab LDA implementation, including a demonstration of its use on the results from the original cohort of mice (n = 168), is publicly available at the following link: The color-based video tracking system will be made available upon request. Likewise, the self-similarity tests implemented in Matlab and the R code used in the RNA-seq data analysis will be made available upon reasonable request.


  1. Eysenck, H. J. The Structure of Human Personality (Methuen & Co., 1953).

  2. McCrae, R. R. & Costa, P. T. Personality in Adulthood: A Five-Factor Theory Perspective (Guilford Press, 2002).

  3. Shemesh, Y. et al. High-order social interactions in groups of mice. eLife 2, e00759 (2013).

  4. Shemesh, Y. et al. Ucn3 and CRF-R2 in the medial amygdala regulate complex social dynamics. Nat. Neurosci. 19, 1489–1496 (2016).

  5. Shoval, O. et al. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science 336, 1157–1160 (2012).

  6. Gallagher, T., Bjorness, T., Greene, R., You, Y. J. & Avery, L. The geometry of locomotive behavioral states in C. elegans. PLoS One 8, e59865 (2013).

  7. Krömer, S. A. et al. Identification of glyoxalase-I as a protein marker in a mouse model of extremes in trait anxiety. J. Neurosci. 25, 4375–4384 (2005).

  8. Butcher, J. N. Minnesota Multiphasic Personality Inventory. in The Corsini Encyclopedia of Psychology (eds Weiner, I. B. & Craighead, W. E.) (2010).

  9. McCrae, R. R., Costa, P. T., Del Pilar, G. H., Rolland, J.-P. & Parker, W. D. Cross-cultural assessment of the five-factor model. J. Cross Cult. Psychol. 29, 171–188 (1998).

  10. Triandis, H. C. & Suh, E. M. Cultural influences on personality. Annu. Rev. Psychol. 53, 133–160 (2002).

  11. Rothbart, M. K. Measurement of temperament in infancy. Child Dev. 52, 569–578 (1981).

  12. Hart, Y. et al. Inferring biological tasks using Pareto analysis of high-dimensional data. Nat. Methods 12, 233–235 (2015).

  13. Hinrich, J. L. et al. Archetypal analysis for modeling multisubject fMRI data. IEEE J. Sel. Top. Signal Process. 10, 1160–1171 (2016).

  14. Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257–286 (1989).

  15. Duda, R. O., Hart, P. E. & Stork, D. G. Pattern Classification (Wiley, 2012).

  16. Breiman, L., Friedman, J., Stone, C. J. & Olshen, R. A. Classification and Regression Trees (CRC Press, 1984).

  17. De Vries, H., Stevens, J. M. G. & Vervaecke, H. Measuring and testing the steepness of dominance hierarchies. Anim. Behav. 71, 585–592 (2006).

  18. Leger, M. et al. Object recognition test in mice. Nat. Protoc. 8, 2531–2537 (2013).

  19. Franklin, K. B. J. & Paxinos, G. The Mouse Brain in Stereotaxic Coordinates (Academic Press, 1997).

  20. Andrews, S. et al. FastQC: a quality control tool for high throughput sequence data. Babraham Bioinformatics (2010).

  21. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).

  22. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  23. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  24. Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).

  25. Hoffman, G. E. & Schadt, E. E. variancePartition: interpreting drivers of variation in complex gene expression studies. BMC Bioinformatics 17, 483 (2016).


The authors thank N. Eren, I. Couzin and C. Wotjak for their assistance, advice and constructive criticism. They thank M. Engel for her technical assistance with the RNA-seq experiment. Thanks are also given to J. Keverne for professional English editing, formatting and scientific input. Their thanks also go to O. Maoz for his unique insights into the mathematics and their interpretation. Finally, the authors would like to extend special thanks to the recently passed Chaya Tannor for fascinating discussions on human personality. A.C. receives financial support from serving as the Vera and John Schwartz Family Professorial Chair at the Weizmann Institute and as the head of the Max Planck Society—Weizmann Institute of Science Laboratory for Experimental Neuropsychiatry and Behavioral Neurogenetics. This work is supported by the following grants and agencies (to A.C.): an FP7 grant from the European Research Council (260463); the Israel Science Foundation (1565/15); the ERANET Program; the Chief Scientist Office of the Israeli Ministry of Health; the Federal Ministry of Education and Research (01KU1501A); Roberto and Renata Ruhman; Bruno and Simone Licht; the I-CORE Program of the Planning and Budgeting Committee and The Israel Science Foundation (grant no. 1916/12); the Nella and Leon Benoziyo Center for Neurological Diseases; the Henry Chanoch Krenter Institute for Biomedical Imaging and Genomics; the Perlman Family Foundation, founded by Louis L. and Anita M. Perlman; the Adelis Foundation; the Marc Besen and the Pratt Foundation; and the Irving I. Moskowitz Foundation. S.K. is supported by the International Max Planck Research School for Translational Psychiatry (IMPRS-TP).

Author information

O.F. and S.K. designed the experiments, analyzed the results and wrote the manuscript. C.T. contributed to the design of the behavioral experiments. M.N., C.F. and P.M.K. assisted in experiments. S.R. performed the preprocessing of the RNA-seq data and contributed to the final analyses. U.A., S.A. and Y.S. contributed to the manuscript. A.C. supervised and supported the project.

Corresponding author

Correspondence to Alon Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Neuroscience thanks Ann Kennedy and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Individual differences and consistencies.

(a) Behavioral readout structure. Hierarchical clustering and cross-correlations of the 60 behavioral readouts for n = 168 mice. Behavioral readouts tend to cluster based on whether they are independent (related to one mouse) or pairwise (derived from the locations of two mice). (b) Some behavioral parameters were consistent within individuals over time (for example, contact rate), some could discriminate between individuals (for example, number of chases: F(1,643) = 43.7, p = 7.8 × 10^-11; time near food or water: F(1,668) = 136.9, p = 6.7 × 10^-29), while others could discriminate between different times (for example, contact duration: F(1,567) = 74.9, p = 5.1 × 10^-17). Several parameters satisfied both conditions (for example, mean speed: interaction effect, F(1,658) = 3.9, p = 4.9 × 10^-2; wall distance: identity effect, F(1,662) = 7.5, p = 6.3 × 10^-3; day effect, F(1,662) = 5.9, p = 1.5 × 10^-2). ***p < 0.001, *p < 0.05; all tests were performed on n = 168 individuals using a two-way ANOVA. In the box plots, boxes represent the 25%, 50% (median) and 75% quantiles, and whiskers span from minima to maxima.

Supplementary Figure 2 Between-within variability ratio.

Identity domain (ID) components ranked by their Fisher-Rao coefficient. Four components had a Fisher-Rao score below 4, indicating a greater contribution of between- over within-individual variability.
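As a rough illustration of a between-within variability ratio, the sketch below computes a generic ratio of between-individual to within-individual scatter for a single score. It is not the authors' Matlab implementation, and the exact definition of the Fisher-Rao coefficient used in the paper is not given in this text; the function name `between_within_ratio`, the mouse labels and the toy values are all assumptions made for the example.

```python
from statistics import mean


def between_within_ratio(scores_by_mouse):
    """Ratio of between-individual to within-individual scatter for one score.

    scores_by_mouse maps a (hypothetical) mouse label to that mouse's
    repeated measurements, for example one value per experimental day.
    """
    all_scores = [s for days in scores_by_mouse.values() for s in days]
    grand_mean = mean(all_scores)

    # Between-individual scatter: spread of each mouse's mean around the
    # grand mean, weighted by the number of measurements for that mouse.
    between = sum(
        len(days) * (mean(days) - grand_mean) ** 2
        for days in scores_by_mouse.values()
    )

    # Within-individual scatter: spread of each measurement around the
    # mean of the mouse it belongs to.
    within = sum(
        (s - mean(days)) ** 2
        for days in scores_by_mouse.values()
        for s in days
    )
    return between / within


# Toy data, made up for illustration: three mice, four days each.
toy = {
    "mouse_a": [1.0, 1.2, 0.8, 1.0],
    "mouse_b": [3.0, 3.1, 2.9, 3.0],
    "mouse_c": [5.0, 4.8, 5.2, 5.0],
}
ratio = between_within_ratio(toy)
print(f"between/within scatter ratio: {ratio:.1f}")
```

A ratio well above 1 means that individuals differ from one another far more than any one individual varies from day to day. The function assumes at least some within-individual variation; otherwise the denominator is zero.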

Supplementary Figure 3 Validation of the identity domains (IDs) in a second dataset from a different setup.

(a) Alternative social arena (50 × 70 cm) with different locations and types of objects compared to the arena shown in Fig. 1b. (b) IDs 1–4 show intermediate to strong correlations between the original and replication datasets (ρ denotes the Pearson correlation coefficient between the sets). Each point represents the score of a mouse tested in the original setup using either the original or alternative projection matrix. Alternative IDs were computed using a projection matrix estimated based on the behaviors of mice in the alternative setup (n = 208 individuals). In order to make use of the same projection matrix, the number of behaviors in the original groups was restricted to the same 37 that could be collected from the alternative setup.

Supplementary Figure 4 Identity domain (ID) stability over a short timescale.

IDs were stable over experimental time, such that average ID scores for experimental days 1 through 4 could predict the corresponding scores for each animal on day 5.
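The short-timescale check described above can be sketched as a correlation between each animal's mean score over days 1 to 4 and its score on day 5. The caption does not state which predictive measure the authors used, so the Pearson correlation here is an assumption, as are the function names and the toy scores.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def short_term_stability(daily_scores):
    """Correlate the mean of days 1-4 with day 5 across animals.

    daily_scores is a list with one entry per animal, each entry holding
    five daily scores on a single ID (hypothetical layout).
    """
    early = [mean(days[:4]) for days in daily_scores]
    last = [days[4] for days in daily_scores]
    return pearson(early, last)


# Toy scores, made up for illustration: four animals, five days each.
toy = [
    [0.9, 1.1, 1.0, 1.0, 1.2],
    [2.0, 2.2, 1.8, 2.0, 1.9],
    [3.1, 2.9, 3.0, 3.0, 3.3],
    [4.0, 4.1, 3.9, 4.0, 3.8],
]
r = short_term_stability(toy)
print(f"correlation between days 1-4 mean and day 5: {r:.3f}")
```

A correlation close to 1 indicates that the rank order and spacing of the animals on this score is largely preserved on the held-out day.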

Supplementary Figure 5 ID score change over time with respect to self or others.

ID stability during aging was tested by comparing the ID scores of individuals measured once as juveniles (4–5 weeks old) and once more during adulthood (15–16 weeks old). Depicted here is the change in ID scores relative to one's own initial score versus relative to the scores of all other individuals (p-values were computed using a one-sided permutation test; n = 32 individuals). Points in the shaded region represent greater individual changes, whereas points in the unshaded region represent changes that were larger relative to other individuals than to oneself.
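One way to frame a one-sided permutation test of this kind is sketched below: the mean absolute change between an animal's juvenile and adult scores is compared with the same statistic after randomly reassigning adult scores to other animals. The test statistic, the null model, the function names and the toy scores are assumptions made for illustration; the caption does not specify the authors' exact procedure.

```python
import random
from statistics import mean


def mean_self_change(juvenile, adult):
    """Mean absolute difference between paired juvenile and adult scores."""
    return mean(abs(a - j) for j, a in zip(juvenile, adult))


def permutation_p_value(juvenile, adult, n_permutations=2000, seed=0):
    """One-sided p-value: is the change relative to self smaller than the
    change relative to randomly assigned other individuals?"""
    rng = random.Random(seed)
    observed = mean_self_change(juvenile, adult)
    shuffled = list(adult)
    hits = 0
    for _ in range(n_permutations):
        # Break the pairing between juvenile and adult scores.
        rng.shuffle(shuffled)
        if mean_self_change(juvenile, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy scores, made up for illustration: eight animals measured twice.
juvenile = [0.1, 0.9, 2.1, 2.8, 4.2, 5.0, 5.9, 7.1]
adult = [0.3, 1.2, 1.8, 3.1, 3.9, 5.2, 6.3, 6.8]
p = permutation_p_value(juvenile, adult)
print(f"one-sided permutation p-value: {p:.4f}")
```

A small p-value under this null model means that an animal's adult score is closer to its own juvenile score than to the juvenile scores of other animals more often than random pairing would produce.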

Supplementary Figure 6 Group shuffle diagram.

Mice were observed in the social boxes over 4 days and re-grouped on day 5 such that no mouse was familiar with any of its new conspecifics (n = 64, 16 groups).

Supplementary Figure 7 Principal component analysis (PCA) on the initial set of behaviors.

In order to compare how LDA performs relative to a better-known and more commonly used dimension reduction method, PCA was performed on the same initial dataset as used to generate the IDs. (a) The percent variance of the behavioral data explained by each principal component (PC). (b) Correlations between scores on each PC and an abbreviated list of behavioral readouts. (c) The stability of PC scores was tested as with the IDs before and after mixing the mouse groups such that all individuals were unfamiliar to one another. Only the first principal component remained stable after the mix (one-sided permutation test, n = 64 individuals). (d) Scores on the first four PCs were used as predictors of transcriptomic variance in RNA-sequencing data from three different brain regions. This analysis directly mimicked the equivalent analysis performed using the four IDs (PC scores from day 1, 200 shuffled PC score sets; randomization test with n = 32 individuals). The top four PCs did not carry more overall transcriptomic information than would be expected by chance.

Supplementary Figure 8 High-anxiety (HAB) versus normal-anxiety (NAB) mouse model.

(a) Selective breeding for high versus normal anxiety-like behavior levels (HAB/NAB) was performed for more than 40 generations, starting with outbred CD-1 mice (ref. 7). Selection was based on results of the Elevated Plus-Maze test (% time in the open arm). After the animals of each respective genotype were weaned, they were mixed into groups of three NABs and one HAB each. (b) The power of the identity domain (ID) scores to detect genotype was tested directly using the area under the receiver operating characteristic curve of a model predicting genotype based on IDs 1–4. The area under the curve of this model was compared against a distribution created based on 200 trials with shuffled ID scores.
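The comparison of an observed area under the ROC curve (AUC) with a shuffled distribution can be sketched as follows. For brevity the sketch uses a single score per animal as the predictor rather than a model fitted on IDs 1–4, which is a simplification and not the authors' analysis. The function names, the genotype coding (1 for HAB, 0 for NAB) and the toy scores are assumptions; only the 200 shuffles follow the caption.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_auc_distribution(scores, labels, n_trials=200, seed=0):
    """AUC values obtained after shuffling scores across animals."""
    rng = random.Random(seed)
    shuffled = list(scores)
    null = []
    for _ in range(n_trials):
        rng.shuffle(shuffled)
        null.append(auc(shuffled, labels))
    return null


# Toy data, made up for illustration: three groups of one HAB (1) and
# three NAB (0) animals, with one score per animal.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
scores = [2.1, 0.3, -0.2, 0.8, 1.7, 0.1, 0.9, -0.5, 1.9, 0.4, 0.0, 0.6]

observed = auc(scores, labels)
null = shuffled_auc_distribution(scores, labels)
# One-sided p-value with add-one correction.
p_value = (sum(a >= observed for a in null) + 1) / (len(null) + 1)
print(f"observed AUC: {observed:.2f}, p-value vs shuffled scores: {p_value:.3f}")
```

Shuffling the scores rather than the labels keeps the class proportions fixed, so the null distribution reflects what the AUC would look like if the scores carried no genotype information.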

Supplementary Figure 9 Correlations between identity domains (IDs) and their contributing behavioral readouts.

The readouts are separated into individual (based on the movements of a single mouse) and pairwise (based on the movements of a mouse and one other member of its group).

Supplementary information

Supplementary Figs. 1–9.

Reporting Summary

Supplementary Video 1

Tracking multiple individuals in a semi-naturalistic environment. A representative segment taken from a video recording of the social arena with a group of four fur-dyed mice. Overlaid on the video are illustrations of tracked mouse locations and the layout and components of the social arena.

Cite this article

Forkosh, O., Karamihalev, S., Roeh, S. et al. Identity domains capture individual differences from across the behavioral repertoire. Nat Neurosci 22, 2023–2028 (2019).
