The wisdom of the inner crowd in three large natural experiments

Abstract

The quality of decisions depends on the accuracy of estimates of relevant quantities. According to the wisdom of crowds principle, accurate estimates can be obtained by combining the judgements of different individuals [1,2]. This principle has been successfully applied to improve, for example, economic forecasts [3–5], medical judgements [6–9] and meteorological predictions [10–13]. Unfortunately, there are many situations in which it is infeasible to collect judgements of others. Recent research proposes that a similar principle applies to repeated judgements from the same person [14]. This paper tests this promising approach on a large scale in a real-world context. Using proprietary data comprising 1.2 million observations from three incentivized guessing competitions, we find that within-person aggregation indeed improves accuracy and that the method works better when there is a time delay between subsequent judgements. However, the benefit pales against that of between-person aggregation: the average of a large number of judgements from the same person is barely better than the average of two judgements from different people.
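One way to see why within-person averaging can plateau while between-person averaging keeps improving is a simple error decomposition. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes each estimate equals the truth plus a person-specific systematic error (mean zero across people, variance `bias_var`) plus independent noise (variance `noise_var`). The function names and the numbers in the usage lines are hypothetical.

```python
def inner_crowd_mse(bias_var: float, noise_var: float, n: int) -> float:
    """Expected squared error of the mean of n estimates from ONE person.

    Assumed model: the person's systematic error is shared by all of their
    estimates, so averaging only shrinks the independent noise term.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return bias_var + noise_var / n


def outer_crowd_mse(bias_var: float, noise_var: float, n: int) -> float:
    """Expected squared error of the mean of n estimates from n DIFFERENT people.

    Assumed model: systematic errors are independent across people, so
    averaging shrinks both the systematic and the noise term.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (bias_var + noise_var) / n


def equivalent_outer_size(bias_var: float, noise_var: float) -> float:
    """Number of different people whose average is as accurate as the
    average of infinitely many estimates from one person (MSE = bias_var)."""
    if bias_var <= 0:
        raise ValueError("bias_var must be positive")
    return (bias_var + noise_var) / bias_var


if __name__ == "__main__":
    # Hypothetical variances chosen only for illustration.
    bias_var, noise_var = 4.0, 1.0
    for n in (1, 2, 10, 100):
        print(n, inner_crowd_mse(bias_var, noise_var, n),
              outer_crowd_mse(bias_var, noise_var, n))
    print("equivalent outer-crowd size:",
          equivalent_outer_size(bias_var, noise_var))
```

Under these assumptions the inner-crowd error can never fall below `bias_var`, whereas the outer-crowd error tends to zero. When the person-specific component dominates the noise, the equivalent outer-crowd size stays below two, which matches the qualitative pattern the abstract reports.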

Fig. 1: MSE of the inner crowd and the outer crowd as a function of the number of included estimates.
Fig. 2
Fig. 3: Values of \(T_{t}^{*}\) for different delays.

References

  1. Surowiecki, J. The Wisdom of Crowds: Why the Many Are Smarter Than the Few (Doubleday Books, New York, NY, 2004).

  2. Page, S. E. The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (Princeton Univ. Press, Princeton, NJ, 2007).

  3. Clemen, R. T. Combining forecasts: a review and annotated bibliography. Int. J. Forecast. 5, 559–583 (1989).

  4. Armstrong, J. S. in Principles of Forecasting: A Handbook for Researchers and Practitioners (ed. Armstrong, J. S.) 417–439 (Kluwer Academic, Norwell, MA, 2001).

  5. Timmermann, A. in Handbook of Economic Forecasting Vol. 1 (eds Elliott, G. et al.) 135–196 (Elsevier, Amsterdam, 2006).

  6. Kurvers, R. H. J. M., Krause, J., Argenziano, G., Zalaudek, I. & Wolf, M. Detection accuracy of collective intelligence assessments for skin cancer diagnosis. JAMA Dermatol. 151, 1346–1353 (2015).

  7. Wolf, M., Krause, J., Carney, P. A., Bogart, A. & Kurvers, R. H. J. M. Collective intelligence meets medical decision-making: the collective outperforms the best radiologist. PLoS ONE 10, e0134269 (2015).

  8. Kurvers, R. H. J. M. et al. Boosting medical diagnostics by pooling independent judgments. Proc. Natl Acad. Sci. USA 113, 8777–8782 (2016).

  9. Kämmer, J. E., Hautz, W. E., Herzog, S. M., Kunina-Habenicht, O. & Kurvers, R. H. J. M. The potential of collective intelligence in emergency medicine: pooling medical students’ independent decisions improves diagnostic performance. Med. Decis. Making 37, 715–724 (2017).

  10. Sanders, F. On subjective probability forecasting. J. Appl. Meteorol. 2, 191–201 (1963).

  11. Staël von Holstein, C.-A. An experiment in probabilistic weather forecasting. J. Appl. Meteorol. 10, 635–645 (1971).

  12. Vislocky, R. L. & Fritsch, J. M. Improved model output statistics forecasts through model consensus. Bull. Am. Meteorol. Soc. 76, 1157–1164 (1995).

  13. Baars, J. A. & Mass, C. F. Performance of national weather service forecasts compared to operational, consensus, and weighted model output statistics. Weather Forecast. 20, 1034–1047 (2005).

  14. Vul, E. & Pashler, H. Measuring the crowd within: probabilistic representations within individuals. Psychol. Sci. 19, 645–647 (2008).

  15. Kelley, T. L. The applicability of the Spearman–Brown formula for the measurement of reliability. J. Educ. Psychol. 16, 300–303 (1925).

  16. Stroop, J. R. Is the judgment of the group better than that of the average member of the group? J. Exp. Psychol. 15, 550–562 (1932).

  17. Preston, M. G. Note on the reliability and the validity of the group judgment. J. Exp. Psychol. 22, 462–471 (1938).

  18. Eysenck, H. J. The validity of judgments as a function of the number of judges. J. Exp. Psychol. 25, 650–654 (1939).

  19. Hogarth, R. M. A note on aggregating opinions. Organ. Behav. Hum. Perform. 21, 40–46 (1978).

  20. Galton, F. Vox populi. Nature 75, 450–451 (1907).

  21. Galton, F. The ballot-box. Nature 75, 509–510 (1907).

  22. Galton, F. Memories of My Life (Methuen & Co, London, 1908).

  23. Gordon, K. Group judgments in the field of lifted weights. J. Exp. Psychol. 7, 398–400 (1924).

  24. Jenness, A. The role of discussion in changing opinion regarding a matter of fact. J. Abnorm. Soc. Psychol. 27, 279–296 (1932).

  25. Gordon, K. Further observations on group judgments of lifted weights. J. Psychol. 1, 105–115 (1935).

  26. Klugman, S. F. Group judgments for familiar and unfamiliar materials. J. Gen. Psychol. 32, 103–110 (1945).

  27. Treynor, J. L. Market efficiency and the bean jar experiment. Financ. Anal. J. 43, 50–53 (1987).

  28. Blackwell, C. & Pickford, R. The wisdom of the few or the wisdom of the many? An indirect test of the marginal trader hypothesis. J. Econ. Finan. 35, 164–180 (2011).

  29. Lorenz, J., Rauhut, H., Schweitzer, F. & Helbing, D. How social influence can undermine the wisdom of crowd effect. Proc. Natl Acad. Sci. USA 108, 9020–9025 (2011).

  30. Ariely, D. et al. The effects of averaging subjective probability estimates between and within judges. J. Exp. Psychol. Appl. 6, 130–147 (2000).

  31. Herzog, S. M. & Hertwig, R. The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping. Psychol. Sci. 20, 231–237 (2009).

  32. Müller-Trede, J. Repeated judgment sampling: boundaries. Judgm. Decis. Mak. 6, 283–294 (2011).

  33. Rauhut, H. & Lorenz, J. The wisdom of crowds in one mind: how individuals can simulate the knowledge of diverse societies to reach better decisions. J. Math. Psychol. 55, 191–197 (2011).

  34. Herzog, S. M. & Hertwig, R. Think twice and then: combining or choosing in dialectical bootstrapping? J. Exp. Psychol. Learn. Mem. Cogn. 40, 218–232 (2014).

  35. Krueger, J. I. & Chen, L. J. The first cut is the deepest: effects of social projection and dialectical bootstrapping on judgmental accuracy. Soc. Cogn. 32, 315–336 (2014).

  36. Herzog, S. M. & Hertwig, R. Harnessing the wisdom of the inner crowd. Trends Cogn. Sci. 18, 504–506 (2014).

  37. Dehaene, S., Izard, V., Spelke, E. & Pica, P. Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures. Science 320, 1217–1220 (2008).

  38. Dehaene, S. Number Sense. How the Mind Creates Mathematics (Oxford Univ. Press, Oxford, 1997).

  39. Nieder, A. Counting on neurons: the neurobiology of numerical competence. Nat. Rev. Neurosci. 6, 177–190 (2005).

  40. Siegler, R. S. & Opfer, J. E. The development of numerical estimation: evidence for multiple representations of numerical quantity. Psychol. Sci. 14, 237–243 (2003).

  41. Siegler, R. S. & Booth, J. L. Development of numerical estimation in young children. Child Dev. 75, 428–444 (2004).

  42. Booth, J. L. & Siegler, R. S. Developmental and individual differences in pure numerical estimation. Dev. Psychol. 42, 189–201 (2006).

  43. Berteletti, I., Lucangeli, D., Piazza, M., Dehaene, S. & Zorzi, M. Numerical estimation in preschoolers. Dev. Psychol. 46, 545–551 (2010).

  44. Hooker, R. Mean or median. Nature 75, 487–488 (1907).

  45. Genest, C. & Zidek, J. V. Combining probability distributions: a critique and an annotated bibliography. Stat. Sci. 1, 114–135 (1986).

  46. Dawid, A. P. et al. Coherent combination of experts’ opinions. Test 4, 263–313 (1995).

  47. Genre, V., Kenny, G., Meyler, A. & Timmermann, A. Combining expert forecasts: can anything beat the simple average? Int. J. Forecast. 29, 108–121 (2013).

  48. Baron, J., Mellers, B. A., Tetlock, P. E., Stone, E. & Ungar, L. H. Two reasons to make aggregated probability forecasts more extreme. Decis. Anal. 11, 133–145 (2014).

  49. Satopää, V. A. et al. Combining multiple probability predictions using a simple logit model. Int. J. Forecast. 30, 344–356 (2014).

  50. Larrick, R. P. & Soll, J. B. Intuitions about combining opinions: misappreciation of the averaging principle. Manage. Sci. 52, 111–127 (2006).

  51. Mannes, A. E. Are we wise about the wisdom of crowds? The use of group judgments in belief revision. Manage. Sci. 55, 1267–1279 (2009).

  52. Fraundorf, S. H. & Benjamin, A. S. Knowing the crowd within: metacognitive limits on combining multiple judgments. J. Mem. Lang. 71, 17–38 (2014).

  53. Hourihan, K. L. & Benjamin, A. S. Smaller is better (when sampling from the crowd within): low memory-span individuals benefit more from multiple opportunities for estimation. J. Exp. Psychol. Learn. Mem. Cogn. 36, 1068–1074 (2010).

  54. Steegen, S., Dewitte, L., Tuerlinckx, F. & Vanpaemel, W. Measuring the crowd within again: a pre-registered replication study. Front. Psychol. 5, 786 (2014).

  55. Krogh, A. & Vedelsby, J. in Advances in Neural Information Processing Systems Vol. 7 (eds Tesauro, G. et al.) 231–238 (MIT Press, Cambridge, MA, 1995).

Acknowledgements

We thank Holland Casino for providing the data, and A. Baillon, S. Herzog, A. Lucas, L. Molleman, A. Opschoor, R. Potter van Loon, V. Spinu, and L. Wolk for their constructive and valuable comments. The paper has benefited from discussions with seminar participants at the Max Planck Institute for Human Development, Carnegie Mellon University and the University of Nottingham, and with participants of the 2015 NIBS workshop, SPUDM 2015 Budapest, WESSI 2016 Abu Dhabi, IMEBESS 2016 Rome, TIBER 2016 Tilburg and BFWG 2017 London. We gratefully acknowledge support from the Netherlands Organisation for Scientific Research (NWO) and from the Economic and Social Research Council via the Network for Integrated Behavioural Sciences (ES/K002201/1). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

D.v.D. and M.J.v.d.A. designed the research, performed the research, contributed new analytic tools, analysed the data, and wrote the paper.

Corresponding author

Correspondence to Dennie van Dolder.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Supplementary Notes, Supplementary Notes 2, Supplementary Tables 1–4, Supplementary Figures 1–18

Life Sciences Reporting Summary

Experiment code

About this article

Cite this article

van Dolder, D., van den Assem, M.J. The wisdom of the inner crowd in three large natural experiments. Nat Hum Behav 2, 21–26 (2018). https://doi.org/10.1038/s41562-017-0247-6

  • DOI: https://doi.org/10.1038/s41562-017-0247-6
