
Perspective

The benefits, risks and bounds of personalizing the alignment of large language models to individuals

Abstract

Large language models (LLMs) undergo ‘alignment’ so that they better reflect human values or preferences, and are safer or more useful. However, alignment is intrinsically difficult because the hundreds of millions of people who now interact with LLMs have different preferences for language and conversational norms, operate under disparate value systems and hold diverse political beliefs. Typically, a small number of developers or researchers dictate alignment norms, risking the exclusion or under-representation of various groups. Personalization is a new frontier in LLM development, whereby models are tailored to individuals. In principle, this could minimize cultural hegemony, enhance usefulness and broaden access. However, unbounded personalization poses risks such as large-scale profiling, privacy infringement, bias reinforcement and exploitation of the vulnerable. Defining the bounds of responsible and socially acceptable personalization is a non-trivial task beset with normative challenges. This article explores ‘personalized alignment’, whereby LLMs adapt to user-specific data, and highlights recent shifts in the LLM ecosystem towards a greater degree of personalization. Our main contribution explores the potential impact of personalized LLMs via a taxonomy of risks and benefits for individuals and society at large. Lastly, we discuss a key open question: what are the appropriate bounds of personalization, and who decides? Answering this normative question would enable users to benefit from personalized alignment while safeguarding against harmful impacts on individuals and society.
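To make the abstract's notion of LLMs 'adapting to user-specific data' concrete, the sketch below shows one common way such personalization is realized in practice: a shared base model is kept frozen and a lightweight low-rank adapter per user is trained on that user's data, then swapped in at inference time. This is a minimal illustrative sketch assuming the Hugging Face transformers and peft libraries; the model name and adapter paths are hypothetical, and it is one possible realization rather than a method proposed in this article.

```python
# Illustrative sketch: per-user personalization via parameter-efficient
# adapters (LoRA) on a frozen shared base model. Not the authors'
# implementation; model name and adapter paths are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, PeftModel

BASE = "my-org/base-chat-model"  # hypothetical base model identifier

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE)

# A small low-rank adapter holds one user's preferences; the base
# model's weights stay frozen and are shared across all users.
lora_cfg = LoraConfig(
    r=8, lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights train

# ... fine-tune `model` on one user's preference data here ...
model.save_pretrained("user_adapters/alice")  # hypothetical path

# At inference time, load and switch between per-user adapters on the
# same frozen base model, so each user gets a personalized model
# without duplicating the base weights.
base_model = AutoModelForCausalLM.from_pretrained(BASE)  # fresh copy
model = PeftModel.from_pretrained(
    base_model, "user_adapters/alice", adapter_name="alice"
)
model.load_adapter("user_adapters/bob", adapter_name="bob")
model.set_adapter("alice")  # serve Alice's personalized model
```

Because the base model is shared and only kilobyte-to-megabyte-scale adapter weights differ per user, this design keeps per-user storage and serving costs low, which is one reason adapter-style personalization has become practical at scale.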


Fig. 1: Hierarchical bounds on personalized alignment.
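One hypothetical reading of the figure's hierarchical bounds is as a tiered veto check: a user-level preference is honoured only if no broader tier (for example, law or regulation, then provider policy) forbids it, leaving user preferences free to vary inside the remaining space. The tier names and rules in this Python sketch are illustrative assumptions, not the figure's actual content.

```python
# Hypothetical sketch of hierarchical bounds on personalization:
# broader tiers veto, narrower tiers choose. Tier names and rules
# are illustrative assumptions only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tier:
    name: str
    allows: Callable[[str], bool]  # returns False to veto a preference

# Checked from broadest to narrowest; user choice applies only within
# the space these tiers leave open.
TIERS = [
    Tier("regulatory", lambda p: "profiling_minors" not in p),
    Tier("provider_policy", lambda p: "medical_dosage_advice" not in p),
]

def apply_personalization(preference: str) -> bool:
    """Honour a user preference only if no higher tier vetoes it."""
    for tier in TIERS:
        if not tier.allows(preference):
            print(f"blocked by {tier.name} bound: {preference!r}")
            return False
    print(f"within bounds, personalizing: {preference!r}")
    return True

apply_personalization("informal_tone")          # user-level choice, allowed
apply_personalization("medical_dosage_advice")  # vetoed by provider policy
```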


Acknowledgements

H.R.K.’s PhD is supported by the Economic and Social Research Council grant ES/P000649/1. P.R. is supported by a MUR FARE 2020 initiative under grant agreement Prot. R20YSMBZ8S (INDOMITA). We thank A. Bean for input and assistance with the literature review.

Author information


Contributions

H.R.K., B.V. and S.A.H. initially conceived of the paper and taxonomy. H.R.K. and B.V. wrote the paper. All authors assisted with iterations and edited and reviewed the paper.

Corresponding author

Correspondence to Hannah Rose Kirk.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Travis Greene and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kirk, H.R., Vidgen, B., Röttger, P. et al. The benefits, risks and bounds of personalizing the alignment of large language models to individuals. Nat Mach Intell 6, 383–392 (2024). https://doi.org/10.1038/s42256-024-00820-y

