The Moral Machine experiment

Matters Arising to this article was published on 04 March 2020

Abstract

With the rapid development of artificial intelligence have come concerns about how machines will make moral decisions, and the major challenge of quantifying societal expectations about the ethical principles that should guide machine behaviour. To address this challenge, we deployed the Moral Machine, an online experimental platform designed to explore the moral dilemmas faced by autonomous vehicles. This platform gathered 40 million decisions in ten languages from millions of people in 233 countries and territories. Here we describe the results of this experiment. First, we summarize global moral preferences. Second, we document individual variations in preferences, based on respondents’ demographics. Third, we report cross-cultural ethical variation, and uncover three major clusters of countries. Fourth, we show that these differences correlate with modern institutions and deep cultural traits. We discuss how these preferences can contribute to developing global, socially acceptable principles for machine ethics. All data used in this article are publicly available.

Fig. 1: Coverage and interface.
Fig. 2: Global preferences.
Fig. 3: Country-level clusters.
Fig. 4: Association between Moral Machine preferences and other variables at the country level.

Data availability

Source data and code that can be used to reproduce Figs. 2–4, Extended Data Figs. 1–7, Extended Data Tables 1, 2, Supplementary Figs. 3–21, and Supplementary Table 2 are all available at the following link: https://goo.gl/JXRrBP. The provided data, both at the individual level (anonymized IDs) and the country level, can be used beyond replication to answer follow-up research questions.

References

  1. Greene, J. Moral Tribes: Emotion, Reason and the Gap Between Us and Them (Atlantic Books, London, 2013).

  2. Tomasello, M. A Natural History of Human Thinking (Harvard Univ. Press, Cambridge, 2014).

  3. Cushman, F. & Young, L. The psychology of dilemmas and the philosophy of morality. Ethical Theory Moral Pract. 12, 9–24 (2009).

  4. Asimov, I. I, Robot (Doubleday, New York, 1950).

  5. Bryson, J. & Winfield, A. Standardizing ethical design for artificial intelligence and autonomous systems. Computer 50, 116–119 (2017).

  6. Wiener, N. Some moral and technical consequences of automation. Science 131, 1355–1358 (1960).

  7. Wallach, W. & Allen, C. Moral Machines: Teaching Robots Right from Wrong (Oxford Univ. Press, Oxford, 2008).

  8. Dignum, V. Responsible autonomy. In Proc. 26th International Joint Conference on Artificial Intelligence 4698–4704 (IJCAI, 2017).

  9. Dadich, S. Barack Obama, neural nets, self-driving cars, and the future of the world. Wired https://www.wired.com/2016/10/president-obama-mit-joi-ito-interview/ (2016).

  10. Shariff, A., Bonnefon, J.-F. & Rahwan, I. Psychological roadblocks to the adoption of self-driving vehicles. Nat. Hum. Behav. 1, 694–696 (2017).

  11. Conitzer, V., Brill, M. & Freeman, R. Crowdsourcing societal tradeoffs. In Proc. 2015 International Conference on Autonomous Agents and Multiagent Systems 1213–1217 (IFAAMAS, 2015).

  12. Bonnefon, J.-F., Shariff, A. & Rahwan, I. The social dilemma of autonomous vehicles. Science 352, 1573–1576 (2016).

  13. Hauser, M., Cushman, F., Young, L., Jin, K.-X. R. & Mikhail, J. A dissociation between moral judgments and justifications. Mind Lang. 22, 1–21 (2007).

  14. Carlsson, F., Daruvala, D. & Jaldell, H. Preferences for lives, injuries, and age: a stated preference survey. Accid. Anal. Prev. 42, 1814–1821 (2010).

  15. Johansson-Stenman, O. & Martinsson, P. Are some lives more valuable? An ethical preferences approach. J. Health Econ. 27, 739–752 (2008).

  16. Johansson-Stenman, O., Mahmud, M. & Martinsson, P. Saving lives versus life-years in rural Bangladesh: an ethical preferences approach. Health Econ. 20, 723–736 (2011).

  17. Graham, J., Meindl, P., Beall, E., Johnson, K. M. & Zhang, L. Cultural differences in moral judgment and behavior, across and within societies. Curr. Opin. Psychol. 8, 125–130 (2016).

  18. Hainmueller, J., Hopkins, D. J. & Yamamoto, T. Causal inference in conjoint analysis: understanding multidimensional choices via stated preference experiments. Polit. Anal. 22, 1–30 (2014).

  19. Luetge, C. The German Ethics Code for automated and connected driving. Philos. Technol. 30, 547–558 (2017).

  20. Müllner, D. Modern hierarchical, agglomerative clustering algorithms. Preprint at https://arxiv.org/abs/1109.2378 (2011).

  21. Inglehart, R. & Welzel, C. Modernization, Cultural Change, and Democracy: The Human Development Sequence (Cambridge Univ. Press, Cambridge, 2005).

  22. Muthukrishna, M. Beyond WEIRD psychology: measuring and mapping scales of cultural and psychological distance. Preprint at https://ssrn.com/abstract=3259613 (2018).

  23. Hofstede, G. Culture’s Consequences: Comparing Values, Behaviors, Institutions and Organizations Across Nations (Sage, Thousand Oaks, 2003).

  24. International Monetary Fund. World Economic Outlook Database https://www.imf.org/external/pubs/ft/weo/2017/01/weodata/index.aspx (2017).

  25. Kaufmann, D., Kraay, A. & Mastruzzi, M. The worldwide governance indicators: methodology and analytical issues. Hague J. Rule Law 3, 220–246 (2011).

  26. Gächter, S. & Schulz, J. F. Intrinsic honesty and the prevalence of rule violations across societies. Nature 531, 496–499 (2016).

  27. O’Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Penguin, London, 2016).

  28. Henrich, J. et al. In search of Homo Economicus: behavioral experiments in 15 small-scale societies. Am. Econ. Rev. 91, 73–78 (2001).

  29. Future of Life Institute. Asilomar AI Principles https://futureoflife.org/ai-principles/ (2017).

  30. Haidt, J. The Righteous Mind: Why Good People Are Divided by Politics and Religion (Knopf Doubleday, New York, 2012).

  31. Gastil, J., Braman, D., Kahan, D. & Slovic, P. The cultural orientation of mass political opinion. PS Polit. Sci. Polit. 44, 711–714 (2011).

  32. Nishi, A., Christakis, N. A. & Rand, D. G. Cooperation, decision time, and culture: online experiments with American and Indian participants. PLoS ONE 12, e0171252 (2017).

Acknowledgements

I.R., E.A., S.D., and R.K. acknowledge support from the Ethics and Governance of Artificial Intelligence Fund. J.-F.B. acknowledges support from the ANR-Labex Institute for Advanced Study in Toulouse.

Author information

Contributions

I.R., A.S. and J.-F.B. planned the research. I.R., A.S., J.-F.B., E.A. and S.D. designed the experiment. E.A. and S.D. built the platform and collected the data. E.A., S.D., R.K., J.S. and A.S. analysed the data. E.A., S.D., R.K., J.S., J.H., A.S., J.-F.B. and I.R. interpreted the results and wrote the paper.

Corresponding authors

Correspondence to Azim Shariff, Jean-François Bonnefon or Iyad Rahwan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Robustness checks: internal validation of three simplifying assumptions.

Calculated values correspond to values in Fig. 2a (that is, AMCE calculated using conjoint analysis). For example, ‘Sparing Pedestrians [Relation to AV]’ refers to the difference between the probability of sparing pedestrians, and the probability of sparing passengers (attribute name: Relation to AV), aggregated over all other attributes. Error bars represent 95% confidence intervals of the means. AV, autonomous vehicle. a, Validation of assumption 1 (stability and no-carryover effect): potential outcomes remain stable regardless of scenario order. b, Validation of assumption 2 (no profile-order effects): potential outcomes remain stable regardless of left–right positioning of choice options on the screen. c, Validation of assumption 3 (randomization of the profiles): potential outcomes are statistically independent of the profiles. This assumption should be satisfied by design. However, a mismatch between the design and the collected data can happen during data collection. This panel shows that using theoretical proportions (by design) and actual proportions (in collected data) of subgroups results in similar effect estimates. See Supplementary Information for more details.
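The quantity described in this caption can be illustrated with a minimal sketch. It assumes, as the caption states, that the effect for an attribute is the difference between the probability of sparing characters at one level and the probability of sparing characters at the baseline level, and it further assumes fully randomized profiles so that a plain difference in means applies. The record layout (`relation_to_av`, `spared`), the toy counts and the normal-approximation interval are hypothetical choices made for illustration; they are not the authors' code or data.

```python
import math


def amce(responses, attribute, level, baseline):
    """Difference in the share of spared characters between two levels of one
    attribute, with an approximate 95% confidence interval.

    Assumes each record is one character profile with a 0/1 `spared` outcome
    and that attribute levels were assigned at random (hypothetical layout).
    """
    def spared_rate(target):
        outcomes = [r["spared"] for r in responses if r[attribute] == target]
        n = len(outcomes)
        p = sum(outcomes) / n
        # Variance of a sample proportion.
        return p, p * (1 - p) / n

    p_level, var_level = spared_rate(level)
    p_base, var_base = spared_rate(baseline)
    estimate = p_level - p_base
    # Normal approximation for the difference of two independent proportions.
    half_width = 1.96 * math.sqrt(var_level + var_base)
    return estimate, (estimate - half_width, estimate + half_width)


# Toy data (invented): pedestrians spared in 60 of 100 profiles,
# passengers spared in 40 of 100 profiles.
toy_responses = (
    [{"relation_to_av": "pedestrian", "spared": 1}] * 60
    + [{"relation_to_av": "pedestrian", "spared": 0}] * 40
    + [{"relation_to_av": "passenger", "spared": 1}] * 40
    + [{"relation_to_av": "passenger", "spared": 0}] * 60
)

estimate, (low, high) = amce(
    toy_responses, "relation_to_av", "pedestrian", "passenger"
)
print(f"Sparing Pedestrians [Relation to AV]: {estimate:.3f} "
      f"(95% CI {low:.3f} to {high:.3f})")
```

A robustness check of the kind shown in the panels would, under the same assumptions, amount to calling `amce` separately on subsets of the records (for example, early versus late scenarios) and comparing the resulting estimates and intervals.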

Extended Data Fig. 2 Robustness checks: external validation of three factors.

Calculated values correspond to values in Fig. 2a (AMCE calculated using conjoint analysis). For example, ‘Sparing Pedestrians [Relation to AV]’ refers to the difference between the probability of sparing pedestrians, and the probability of sparing passengers (attribute name: Relation to AV), aggregated over all other attributes. Error bars represent 95% confidence intervals of the means. a, Validation of textual description (seen versus not seen). By default, respondents see only the visual representation of a scenario. Interpretation of what type of characters they represent (for example, female doctor) may not be obvious. Optionally, respondents can read a textual description of the scenario by clicking on ‘see description’. This panel shows that direction and (except in one case) order of effect estimates remain stable. The magnitude of the effects increases for respondents who read the textual descriptions, which means that the effects reported in Fig. 2a were not overestimated because of visual ambiguity. b, Validation of device used (desktop versus mobile). Direction and order of effect estimates remain stable regardless of whether respondents used desktop or mobile devices when completing the task. c, Validation of data set (all data versus full first-session data versus survey-only data). Direction and order of effect estimates remain stable regardless of whether the data used in analysis are all data, data restricted to only first completed (13-scenario) session by any user, or data restricted to completed sessions after which the demographic survey was taken. First completed session by any user is an interesting subset of the data because respondents had not seen their summary of results yet, and respondents ended up completing the session. Survey-only data are also interesting given that the conclusions about individual variations in the main paper and from Extended Data Fig. 3 and Extended Data Table 1 are drawn from this subset. 
See Supplementary Information for more details.

Extended Data Fig. 3 Average marginal causal effect (AMCE) of attributes for different subpopulations.

Subpopulations are characterized by respondents’ age (a, older versus younger), gender (b, male versus female), education (c, less versus more educated), income (d, higher versus lower income), political views (e, conservative versus progressive), and religious views (f, not religious versus very religious). Error bars represent 95% confidence intervals of the means. Note that AMCE has a positive value for all considered subpopulations; for example, both male and female respondents indicated a preference for sparing females, but the latter group showed a stronger preference. See Supplementary Information for a detailed description of the cutoffs and the groupings of ordinal categories that were used to define each subpopulation.

Extended Data Fig. 4 Hierarchical cluster of countries based on country-level effect sizes calculated after filtering out responses for which the linguistic description was seen, thus neutralizing any potential effect of language.

The three colours of the dendrogram branches represent three large clusters: Western, Eastern, and Southern. The names of the countries are coloured according to the Inglehart–Welzel Cultural Map 2010–2014 (ref. 21). See Supplementary Information for more details. The dendrogram is essentially similar to that shown in Fig. 3a.

Extended Data Fig. 5 Validation of hierarchical cluster of countries.

a, b, We use two internal metrics of validation of three linkage criteria of calculating hierarchical clustering (Ward, Complete and Average) in addition to the K-means algorithm: a, Calinski–Harabasz index; b, silhouette index. The x axis indicates the number of clusters. For both internal metrics, a higher index value indicates a ‘better’ fit of partition to the data. c, d, We use two external metrics of validation of the used hierarchical clustering algorithm (Ward) versus those of random clustering assignment: c, purity; d, maximum matching. The histogram shows the distributions of purity and maximum matching values derived from randomly assigning countries to nine clusters. The red dotted lines indicate purity and maximum matching values computed from the clustering output of the hierarchical clustering algorithm using AMCE values. See Supplementary Information for more details.
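The external validation in panel c can be sketched as follows: purity is computed for an observed clustering against a reference labelling, and the same metric is computed for many random assignments to give a baseline distribution. This is a minimal illustration under stated assumptions. The reference labels, the observed assignment, the number of clusters and the number of random draws below are all invented, and the sketch covers purity only, not maximum matching.

```python
import random
from collections import Counter


def purity(cluster_labels, reference_labels):
    """Share of items that belong to the majority reference class of their cluster."""
    by_cluster = {}
    for cluster, ref in zip(cluster_labels, reference_labels):
        by_cluster.setdefault(cluster, Counter())[ref] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_labels)


def random_purity_distribution(reference_labels, n_clusters, n_draws, seed=0):
    """Purity values obtained when items are assigned to clusters uniformly at random."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        assignment = [rng.randrange(n_clusters) for _ in reference_labels]
        draws.append(purity(assignment, reference_labels))
    return draws


# Hypothetical reference classes for 12 items and a hypothetical clustering output.
reference = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
observed = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 0]

observed_purity = purity(observed, reference)
random_draws = random_purity_distribution(reference, n_clusters=3, n_draws=1000)

# Fraction of random assignments that match or beat the observed clustering.
share_at_least = sum(d >= observed_purity for d in random_draws) / len(random_draws)
print(f"observed purity: {observed_purity:.3f}")
print(f"share of random assignments with purity at least as high: {share_at_least:.3f}")
```

In this toy setup, an observed purity far in the right tail of the random distribution plays the role of the red dotted line lying to the right of the histogram.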

Extended Data Fig. 6 Demographic distributions of sample of population that completed the survey on Moral Machine (MM) website.

Distributions are based on gender (a), age (b), income (c), and education attributes (d). Most users on Moral Machine are male, went through college, and are in their 20s or 30s. While this indicates that the users of Moral Machine are not a representative sample of the whole population, it is important to note that this sample at least covers broad demographics. See Supplementary Information for more details.

Extended Data Fig. 7 Demographic distributions of US sample of population that completed the survey on Moral Machine website versus US sample of population in American Community Survey (ACS) data set.

a–d, Only gender (a), age (b), income (c), and education (d) attributes are available for both data sets. The MM US sample has an over-representation of males and younger individuals compared to the ACS US sample. e, A comparison of effect sizes as calculated for US respondents who took the survey on MM with the use of post-stratification to match the corresponding proportions for the ACS sample. Except for ‘Relation to AV’ (the second smallest effect), the direction and order of all effects are unaffected. See Supplementary Information for more details.
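Post-stratification of the kind mentioned for panel e can be sketched in a few lines: the outcome is averaged within each demographic stratum of the sample, and the stratum means are then combined using the population's stratum shares rather than the sample's. The stratum variable, the outcome name `prefers_young`, the toy counts and the 50/50 population shares below are hypothetical values chosen for illustration; they are not ACS figures or Moral Machine results, and the authors' actual procedure may differ.

```python
def post_stratified_mean(sample, population_shares, stratum_key, value_key):
    """Average of stratum means, weighted by population shares.

    Raises KeyError if a population stratum has no respondents in the sample.
    """
    totals, counts = {}, {}
    for row in sample:
        stratum = row[stratum_key]
        totals[stratum] = totals.get(stratum, 0.0) + row[value_key]
        counts[stratum] = counts.get(stratum, 0) + 1
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Toy sample (invented) that over-represents males: 80 males, 20 females.
sample = (
    [{"gender": "male", "prefers_young": 1}] * 40
    + [{"gender": "male", "prefers_young": 0}] * 40
    + [{"gender": "female", "prefers_young": 1}] * 15
    + [{"gender": "female", "prefers_young": 0}] * 5
)

# Hypothetical population shares, standing in for a census-style reference.
population_shares = {"male": 0.5, "female": 0.5}

raw_mean = sum(row["prefers_young"] for row in sample) / len(sample)
adjusted = post_stratified_mean(sample, population_shares, "gender", "prefers_young")
print(f"unweighted mean: {raw_mean:.3f}, post-stratified mean: {adjusted:.3f}")
```

The adjusted value moves toward the under-represented stratum's mean, which is the sense in which reweighting can change effect magnitudes while leaving their direction and order largely intact.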

Extended Data Table 1 Regression table showing the individual variations for each of the nine attributes
Extended Data Table 2 Country-level OLS regressions showing the relationships between key ethical preferences and various social, political and economic measures

Supplementary information

Supplementary Information

This file contains Supplementary Text, Supplementary Figures 1-11, Supplementary Tables 1-2 and Supplementary References.

Reporting Summary

About this article

Cite this article

Awad, E., Dsouza, S., Kim, R. et al. The Moral Machine experiment. Nature 563, 59–64 (2018). https://doi.org/10.1038/s41586-018-0637-6

  • DOI: https://doi.org/10.1038/s41586-018-0637-6
