A robust data-driven approach identifies four personality types across four large data sets


Understanding human personality has been a focus for philosophers and scientists for millennia [1]. It is now widely accepted that there are about five major personality domains that describe the personality profile of an individual [2,3]. In contrast to personality traits, the existence of personality types remains extremely controversial [4]. Despite the various purported personality types described in the literature, small sample sizes and the lack of reproducibility across data sets and methods have led to inconclusive results about personality types [5,6]. Here we develop an alternative approach to the identification of personality types, which we apply to four large data sets comprising more than 1.5 million participants. We find robust evidence for at least four distinct personality types, extending and refining previously suggested typologies. We show that these types appear as a small subset of a much more numerous set of spurious solutions in typical clustering approaches, highlighting principal limitations in the blind application of unsupervised machine learning methods to the analysis of big data.
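The point about spurious clustering solutions can be illustrated with a toy example. The sketch below is not the authors' pipeline, which the abstract does not describe: it assumes two-dimensional Gaussian noise as a stand-in for trait scores, a plain k-means in place of whatever clustering method is used, and a column-shuffled null model as one possible safeguard. All function names (`kmeans`, `shuffle_columns`, `count_near`) and parameter values are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(group) for col in zip(*group))

def kmeans(points, k, rng, iters=25):
    """Plain Lloyd's k-means. It returns k centers whether or not
    the data contain any real cluster structure."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def shuffle_columns(points, rng):
    """Null model (an assumption of this sketch): shuffle each column
    independently, keeping the marginals but breaking joint structure."""
    cols = [list(c) for c in zip(*points)]
    for c in cols:
        rng.shuffle(c)
    return list(zip(*cols))

def count_near(points, center, radius):
    """Number of points within `radius` of `center`."""
    return sum(1 for p in points if sq_dist(p, center) <= radius ** 2)

rng = random.Random(0)
# Structureless toy data: a single Gaussian blob, so there are no "types".
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3000)]

centers = kmeans(points, 4, rng)          # still yields four "clusters"
null_points = shuffle_columns(points, rng)

# Density around each center relative to the shuffled null data.
ratios = [
    count_near(points, c, 0.5) / max(1, count_near(null_points, c, 0.5))
    for c in centers
]
for c, r in zip(centers, ratios):
    print(f"center=({c[0]:+.2f}, {c[1]:+.2f})  density ratio vs null={r:.2f}")
```

In this toy setting the clustering step reports four centers even though the data have no types, while the density ratios stay close to 1, which marks those centers as no denser than chance. A ratio well above 1 would be the kind of evidence that distinguishes a meaningful cluster from a spurious one.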


Fig. 1: Uncertainty in the ARC-type classification.
Fig. 2: Clustering reveals four meaningful personality types.
Fig. 3: Replicability of personality types in three independent data sets.
Fig. 4: The composition of four meaningful clusters is correlated with age and gender and is stable across different data sets.

Data availability

Data are available from https://osf.io/tbmh5/ (Johnson-300 and Johnson-120), http://mypersonality.org (myPersonality-100) and https://doi.org/10.5255/UKDA-SN-7656-1 (BBC-44).


References

1. Revelle, W., Wilt, J. & Condon, D. M. in The Wiley-Blackwell Handbook of Individual Differences (eds Chamorro-Premuzic, T. et al.) 1–38 (Wiley-Blackwell, Oxford, 2013).
2. McCrae, R. R. & Costa, P. T. in The SAGE Handbook of Personality Theory and Assessment: Volume 1 Personality Theories and Models (eds Boyle, G. J. et al.) 273–294 (SAGE, London, 2008).
3. Widiger, T. A. The Oxford Handbook of the Five Factor Model of Personality (Oxford Univ. Press, Oxford, 2015).
4. McCrae, R. R., Terracciano, A., Costa, P. T. & Ozer, D. J. Person-factors in the California adult Q-set: closing the door on personality trait types? Eur. J. Pers. 20, 29–44 (2006).
5. Donnellan, M. B. & Robins, R. W. Resilient, overcontrolled, and undercontrolled personality types: issues and controversies. Soc. Pers. Psychol. Compass 11, 1070–1083 (2010).
6. Specht, J., Luhmann, M. & Geiser, C. On the consistency of personality types across adulthood: latent profile analyses in two large-scale panel studies. J. Pers. Soc. Psychol. 107, 540–556 (2014).
7. Goldberg, L. R. An alternative “description of personality”: the Big-Five factor structure. J. Pers. Soc. Psychol. 59, 1216–1229 (1990).
8. Costa, P. T. & McCrae, R. R. NEO PI-R Professional Manual (Psychological Assessment Resources, Odessa, FL, 1992).
9. Ozer, D. J. & Benet-Martínez, V. Personality and the prediction of consequential outcomes. Annu. Rev. Psychol. 57, 401–421 (2006).
10. Widiger, T. A. & Costa, P. T. Jr. Personality Disorders and the Five-Factor Model of Personality 3rd edn (American Psychological Association, Washington DC, 2013).
11. Asendorpf, J. B., Borkenau, P., Ostendorf, F. & Van Aken, M. A. G. Carving personality description at its joints: confirmation of three replicable personality prototypes for both children and adults. Eur. J. Pers. 15, 169–198 (2001).
12. Robins, R. W., John, O. P., Caspi, A., Moffitt, T. E. & Stouthamer-Loeber, M. Resilient, overcontrolled, and undercontrolled boys: three replicable personality types. J. Pers. Soc. Psychol. 70, 157–171 (1996).
13. Caspi, A. & Silva, P. A. Temperamental qualities at age three predict personality traits in young adulthood: longitudinal evidence from a birth cohort. Child Dev. 66, 486–498 (1995).
14. Block, J. Lives Through Time (Bancroft Press, Berkeley, CA, 1971).
15. Costa, P. T., Herbst, J. H., McCrae, R. R., Samuels, J. & Ozer, D. J. The replicability and utility of three personality types. Eur. J. Pers. 16, S73–S87 (2002).
16. Herzberg, P. Y. & Roth, M. Beyond resilients, undercontrollers, and overcontrollers? An extension of personality prototype research. Eur. J. Pers. 20, 5–28 (2006).
17. Altman, N. & Krzywinski, M. Points of significance: clustering. Nat. Methods 14, 545–546 (2017).
18. Ashton, M. C. & Lee, K. An investigation of personality types within the HEXACO personality framework. J. Individ. Differ. 30, 181–187 (2009).
19. Isler, L., Fletcher, G. J. O., Liu, J. H. & Sibley, C. G. Validation of the four-profile configuration of personality types within the Five-Factor model. Pers. Individ. Dif. 106, 257–262 (2017).
20. Rentfrow, P. J. et al. Divided we stand: three psychological regions of the United States and their political, economic, social, and health correlates. J. Pers. Soc. Psychol. 105, 996–1012 (2013).
21. Rentfrow, P. J., Jokela, M. & Lamb, M. E. Regional personality differences in Great Britain. PLoS ONE 10, e0122245 (2015).
22. Revelle, W. et al. in SAGE Handbook of Online Research Methods (eds Fielding, N. G. et al.) 578–595 (SAGE, London, 2016).
23. Jain, A. K. Data clustering: 50 years beyond k-means. Pattern Recogn. Lett. 31, 651–666 (2010).
24. Goldberg, L. R. in Personality Psychology in Europe Vol. 7 (eds Mervielde, I., Deary, I., De Fruyt, F. & Ostendorf, F.) 7–28 (Tilburg Univ. Press, Tilburg, 1999).
25. Revelle, W. An Introduction to Psychometric Theory with Applications in R (Personality Project, 2017); http://www.personality-project.org/r/book/
26. Costa, P. T. & McCrae, R. in The Oxford Handbook of the Five Factor Model (ed. Widiger, T. A.) 1–52 (Oxford Univ. Press, Oxford, 2015).
27. Burnham, K. P. & Anderson, D. R. Model Selection and Multimodel Inference 2nd edn (Springer, New York, NY, 2002).
28. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978).
29. Fortunato, S. & Barthelemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007).
30. Lancichinetti, A. et al. A high-reproducibility and high-accuracy method for automated topic classification. Phys. Rev. X 5, 011007 (2015).
31. Horn, J. L. A rationale and test for the number of factors in factor analysis. Psychometrika 30, 179–185 (1965).
32. Xie, X., Chen, W., Lei, L., Xing, C. & Zhang, Y. The relationship between personality types and prosocial behavior and aggression in Chinese adolescents. Pers. Individ. Dif. 95, 56–61 (2016).
33. Terracciano, A., McCrae, R. R., Brent, L. J. & Costa, P. T. Hierarchical linear modeling analyses of the NEO-PI-R scales in the Baltimore longitudinal study of aging. Psychol. Aging 20, 493–506 (2005).
34. Meeus, W., Van de Schoot, R., Klimstra, T. & Branje, S. Personality types in adolescence: change and stability and links with adjustment and relationships: a five-wave longitudinal study. Dev. Psychol. 47, 1181–1195 (2011).
35. Eysenck, H. J. & Eysenck, M. W. Personality and Individual Differences: a Natural Science Approach (Plenum Press, New York, NY, 1985).
36. Johnson, J. A. Measuring thirty facets of the Five Factor model with a 120-item public domain inventory: development of the IPIP-NEO-120. J. Res. Pers. 51, 78–89 (2014).
37. Condon, D. M. The SAPA personality inventory: an empirically-derived, hierarchically-organized self-report personality assessment model. Preprint at https://psyarxiv.com/sc4p9/ (2018).
38. Vazire, S. & Mehl, M. Knowing me, knowing you: the accuracy and unique predictive validity of self-ratings and other-ratings of daily behavior. J. Pers. Soc. Psychol. 95, 1202–1216 (2008).
39. Paulhus, D. L. & Vazire, S. in Handbook of Research Methods in Personality Psychology (eds Robins, R. W. et al.) 224–239 (Guilford, New York, NY, 2007).
40. Chapman, B. & Goldberg, L. Replicability and 40-year predictive power of childhood ARC types. J. Pers. Soc. Psychol. 101, 593–606 (2011).
41. Steca, P., Alessandri, G. & Caprara, G. V. The utility of a well-known personality typology in studying successful aging: resilients, undercontrollers, and overcontrollers in old age. Pers. Individ. Dif. 48, 442–446 (2010).
42. Kosinski, M., Matz, S., Gosling, S., Popov, V. & Stillwell, D. Facebook as a social science research tool: opportunities, challenges, ethical considerations and practical guidelines. Am. Psychol. 70, 543–556 (2015).
43. University of Cambridge, Department of Psychology, British Broadcasting Corporation BBC Big Personality Test, 2009–2011: Dataset for Mapping Personality across Great Britain [data collection] (UK Data Service, 2015); https://doi.org/10.5255/UKDA-SN-7656-1
44. Gosling, S. D., Vazire, S., Srivastava, S. & John, O. P. Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires. Am. Psychol. 59, 93–104 (2004).
45. Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning 2nd edn (Springer, New York, NY, 2009).
46. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
47. Kaiser, H. F. The varimax criterion for analytic rotation in factor analysis. Psychometrika 23, 187–200 (1958).
48. Factor rotation. Python code for factor rotation (GitHub, 2017); http://github.com/mvds314/factor_rotation
49. Carroll, J. An analytical solution for approximating simple structure in factor analysis. Psychometrika 18, 23–38 (1953).
50. Bishop, C. Pattern Recognition and Machine Learning (Springer, New York, NY, 2006).


Acknowledgements

L.A.N.A. acknowledges the John and Leslie McQuown Gift and support from the Department of Defense Army Research Office under grant number W911NF-14-1-0259. W.R.’s work was partially supported by a grant from the National Science Foundation: SMA-1419324. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank J. Johnson for making the Johnson-300 and the Johnson-120 data sets publicly available; D. Stillwell, M. Kosinski and the myPersonality project for sharing the myPersonality-100 data; and the BBC LabUK for making the BBC-44 data set publicly available.

Author information

M.G., B.F., W.R. and L.A.N.A. designed the research. M.G., B.F., W.R. and L.A.N.A. performed the research. M.G. and B.F. analysed the data. M.G., W.R. and L.A.N.A. wrote the paper.

Correspondence to Luís A. Nunes Amaral.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figures 1–17; Supplementary Table 1; Supplementary Methods; Supplementary References

Reporting Summary

About this article

Cite this article

Gerlach, M., Farb, B., Revelle, W. et al. A robust data-driven approach identifies four personality types across four large data sets. Nat Hum Behav 2, 735–742 (2018). https://doi.org/10.1038/s41562-018-0419-z
