Abstract
Large language models offer novel opportunities to seek digital medical advice. While previous research primarily addressed the performance of such artificial intelligence (AI)-based tools, public perception of these advancements received little attention. In two preregistered studies (n = 2,280), we presented participants with scenarios of patients obtaining medical advice. All participants received identical information, but we manipulated the putative source of this advice (‘AI’, ‘human physician’, ‘human + AI’). ‘AI’- and ‘human + AI’-labeled advice was evaluated as significantly less reliable and less empathetic compared with ‘human’-labeled advice. Moreover, participants indicated lower willingness to follow the advice when AI was believed to be involved in advice generation. Our findings point toward an anti-AI bias when receiving digital medical advice, even when AI is supposedly supervised by physicians. Given the tremendous potential of AI for medicine, elucidating ways to counteract this bias should be an important objective of future research.
Main
Artificial intelligence (AI) holds enormous potential for the medical domain, for instance in analyzing medical images1 or detecting drug interactions2. Recent developments in the field of AI-based large language models (LLMs) have now given rise to numerous additional applications in healthcare. One use case that is becoming increasingly relevant from a public perspective is the use of LLMs when seeking medical advice3. To this end, popular LLM applications such as OpenAI’s ChatGPT offer low-threshold access to medical information, seemingly without the need to consult specialist literature or professional physicians. While there are considerable concerns regarding the use of LLMs in healthcare4, earlier research indicates that ChatGPT 4.0 already achieves levels of diagnostic accuracy on a par with human physicians5. In another study, physicians (unaware of who created the advice) even rated LLM-generated responses to medical queries as superior in quality and more empathetic than answers generated by human physicians6.
AI-generated medical advice is thus perceived as being of high quality as long as the AI authorship remains undisclosed. Yet research from various domains indicates reservations about such content as soon as the AI authorship becomes apparent (algorithm aversion7). While similar effects have been observed for the use of AI-based tools in the medical field8, little is known about the perception of novel LLM applications. Moreover, previous research in this field often relied on relatively small samples, lacked an experimental study design (for example, refs. 9,10) or focused solely on the physician’s perspective11. However, from the public’s point of view, not only the objective level of competence but also the subjective perception of the treating physician has a substantial influence on health-promoting behavior, treatment satisfaction and treatment outcome12. Similarly, when seeking medical advice in digital settings, not only technical performance but also the public’s perception of a new tool might prove decisive for its further dissemination and acceptance.
Within our work, we thus aim to explore the public’s perspective on LLM-generated medical advice in a controlled, experimental setting. To this end, we conducted two preregistered experiments with large samples (study 1: n = 1,050 across various nationalities; study 2: n = 1,230, representative of the UK population in terms of age, gender and ethnicity). Within both studies, we investigated how labeling identical medical advice as generated either by a human physician or by an AI-supported chatbot affects how this information is perceived in terms of reliability, comprehensibility and empathy. As it is expected that AI will not replace but rather support human competencies in the future13, we further extend previous research by including a third group in which the information was labeled as generated by a human physician in collaboration with AI. In study 2, we additionally measured the individual willingness to follow the provided advice. Moreover, we assessed participants’ interest in testing the tool that supposedly generated the previously encountered medical information by offering them the opportunity to save a (fictitious) link to a corresponding platform. Thus, our research abstracts from potential differences in the quality of AI- versus human-generated medical information. Instead, we focus on illuminating potential biases toward novel LLM-based tools as sources of medical advice from the public’s perspective.
Figure 1 shows average ratings for each dimension (empathy, reliability, comprehensibility) and author label (‘human’, ‘AI’, ‘human + AI’) in study 1. There was a significant main effect of the author label on empathy ratings, test statistics of corresponding one-way analysis of variance F(2, 1,047) = 7.98, P < 0.001, partial eta squared ηp² = 0.02. That is, the ‘human’ advice was perceived as significantly more empathic than ‘AI’ advice, test statistics of corresponding two-sample t-test t(698) = 3.58, P < 0.001, Cohen’s d = 0.27, 95% confidence interval (CI) (0.12, 0.42), and ‘human + AI’ advice, t(698) = 3.44, P = 0.001, d = 0.26, 95% CI (0.11, 0.41). There was no difference in empathy ratings between the ‘AI’ and ‘human + AI’ conditions, t < 1. Reliability ratings differed significantly between author labels, F(2, 1,047) = 9.68, P < 0.001, ηp² = 0.02. ‘Human’ advice was rated as significantly more reliable than ‘AI’ advice, t(698) = 3.72, P < 0.001, d = 0.28, 95% CI (0.13, 0.43), and ‘human + AI’ advice, t(698) = 3.90, P < 0.001, d = 0.29, 95% CI (0.15, 0.44). Reliability ratings did not differ between ‘AI’ and ‘human + AI’ advice, t < 1. Comprehensibility ratings were not affected by the author label, F < 1. Corresponding mixed-effect regression analyses are reported in Supplementary Information.

Figure 2 shows the main results of study 2. For all analyses, the pattern mirrors the results observed in study 1. Thus, ‘human’ advice was evaluated as more empathic and more reliable, ts ≥ 3.01, Ps ≤ 0.003, ds ≥ 0.21, but not as more comprehensible, ts < 1, compared with ‘AI’ and ‘human + AI’ advice. Along the same lines, participants indicated a significantly lower willingness to follow the provided advice when AI was believed to be involved in advice generation, ts ≥ 4.46, Ps ≤ 0.001, ds ≥ 0.31. However, the share of participants who saved the link to the (fictitious) platform (‘human’: 19.3%; ‘AI’: 18.5%; ‘human + AI’: 22.9%) did not differ between the ‘human’ and the ‘AI’ condition, non-standardized regression coefficient of logistic regression b = 0.05, test statistics of corresponding Wald test z = 0.27, P = 0.789, nor between the ‘human’ and the ‘human + AI’ condition, b = 0.22, z = 1.28, P = 0.200. Detailed results and exploratory analyses on interindividual differences are provided in Supplementary Information.
Our findings may be based on a stronger association of the ‘human physician’ label with a mutual demonstration of care and respect, which are vital factors for successful patient–physician interactions14. Hence, our results reinforce that the public perceives physicians as more appropriate sources of medical information than AI-based tools. This outcome is in line with earlier findings on algorithm aversion7, particularly within the medical domain (for example, ref. 11). Conversely, the use of AI may have been perceived as ‘dehumanizing’15, a sentiment highlighted by the lower empathy scores for AI-labeled advice in both of our studies. A further explanation for the observed resistance to AI-generated medical advice might be the phenomenon of ‘uniqueness neglect’, wherein users believe AI may not adequately consider their individual characteristics. Consequently, explaining that even AI-generated advice processes and considers personal information provided by the individuals themselves could potentially increase acceptance16.
Our observation that human-labeled medical advice was not perceived as significantly more comprehensible than AI-labeled information indicates that the label effect exerted less influence on this dimension than on reliability and empathy. Possibly, this dimension was perceived as more technical in nature and therefore less critical for sensitive medical settings, so that AI could perform on an equivalent level with human physicians in this respect. This observation is noteworthy, as it indicates that AI was not evaluated more ‘negatively’ across the board, as can occur when the judgment of one feature influences judgments of other features of the same evaluation object, known as the halo effect17.
Furthermore, there was no effect of the author label on the decision to save a link to the platform where the provided responses supposedly were generated. Even though this finding does not necessarily imply that people would also follow AI-generated medical advice, it seems that members of the public at least show interest in corresponding AI-based tools. Whether this initial exploration would also lead to a long-term and unbiased use, however, is yet to be explored.
Both of our studies have limitations. First, to ensure high internal validity, participants in the experiments had to adopt the perspective of other individuals and therefore were unable to formulate their own inquiries. Moreover, the examined dialogs consisted of only one incoming question and a single subsequent response. Thus, our chosen setting was representative of brief interactions on a digital medical platform, while not capturing extensive interactions such as those in face-to-face doctor–patient consultations. We consider such a more interactive, but also less controlled, environment an intriguing avenue for future research.
Our findings indicate a bias against medical advice labeled as AI-generated, regardless of additional supervision by human physicians. Along the same lines, previous research has shown that the public’s reservations toward medical AI persist even when AI-generated content is medically supervised18. Considering the expected surge of AI in healthcare and the immense potential for human–AI collaboration, this finding raises notable concerns. Addressing this bias will require engaging not only the general public but also other stakeholders, such as physicians and insurance providers. Interestingly, another study showed that if people were assured that humans would remain unequivocally in the decision-making position, the combination of human and AI achieved significantly higher levels of trust than without this assurance19. Consequently, the specific framing of the involvement of AI in generating and delivering medical advice may be pivotal for its public acceptance.
Methods
Ethics and inclusion
All participants received detailed instructions regarding their task, provided informed consent and were debriefed about the study purpose at the end of the experiment. Both of our studies were conducted in accordance with the Declaration of Helsinki. We received formal approval from the ethics committee of the Institute of Psychology of the Faculty of Human Sciences of the University of Würzburg before conducting the studies (GZEK 2023-66).
Study 1
Participants
The study was programmed with lab.js (version 20.2.4 (ref. 20)) and hosted on a private web server. We recruited 1,090 participants via Prolific (www.prolific.com), among which 3.7% (n = 40) did not finish the experiment and were thus excluded from the analysis (final sample size: 1,050; 350 per author label group; self-reported gender identity: 555 males, 489 females, 5 non-binaries, 1 prefer not to say; age: M = 33.0 years, s.d. = 11.5 years). This sample size provided high statistical power to detect even small effects of the author label on reported ratings (1 − β = 95% for d ≥ 0.273, α = 0.05 (where β and α are the type II and type I error probabilities, respectively), two-sample t-test, two-tailed testing, computed in R, version 4.1.1, via the power.t.test function of the stats package version 3.6.2). The majority of this sample indicated a university degree as their highest level of education (3 no formal qualification, 53 secondary education, 265 high school, 500 bachelor, 195 master, 28 PhD, 6 prefer not to say). Participants reported about 60 different nationalities, with South Africa (n = 262), the United Kingdom (n = 174) and Poland (n = 76) mentioned most frequently.
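The reported sensitivity analysis can be reproduced with base R alone. The following minimal sketch assumes only the parameters stated above (n = 350 per group, α = 0.05, two-tailed two-sample t-test) and solves for the smallest detectable standardized effect:

# Smallest effect size (Cohen's d) detectable with 95% power at alpha = .05
# and n = 350 participants per author label group. With sd = 1, 'delta' from
# stats::power.t.test is a standardized mean difference.
power.t.test(n = 350, power = 0.95, sig.level = 0.05, sd = 1,
             type = "two.sample", alternative = "two.sided")$delta
# approximately 0.273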
Materials
Case reports
The case reports used in this study address four distinct medical topics: smoking cessation, colonoscopy, agoraphobia and reflux disease (Supplementary Figs. 1–4). Each of these scenarios comprises a brief dialog consisting of an inquiry as it might be presented by a medical layperson using a chat interface on a digital health platform, along with an appropriate response to this inquiry. The queries were constructed and validated by a certified physician. To generate the responses in a style similar to that of popular LLMs, the preceding inquiries were used as prompts for OpenAI’s ChatGPT 3.5. The resulting outputs were edited for wording, supplemented with additional information and scrutinized for medical accuracy by a certified physician. Thus, all case reports constituted a collaboration between AI and a human physician, regardless of the information provided to the participants during the experiment.
Scales
Participants evaluated the presented case reports regarding perceived reliability, comprehensibility and empathy. By using these categories, we closely adhered to existing literature on key evaluation criteria from the patient’s perspective in doctor–patient interactions (see refs. 6,21 for ‘reliability’ and ‘empathy’ and ref. 22 for ‘comprehensibility’). Moreover, these three dimensions allowed us to cover different facets of medical dialogs in a reasonably comprehensive and distinct manner. With ‘reliability’, we addressed the assessment of the content of the medical advice (content-related component). With ‘comprehensibility’, we recorded the public understandability and how accessible the information was structured (format-related component). Finally, with ‘empathy’, we captured the transfer of information on an emotional interpersonal level (interaction-related component). As no established survey instruments with practice-proven suitability for the present research question exist, we developed novel scales closely aligned with best practices in this field. That is, we decided on a relatively low number of response options with individual, unambiguous labels and used symmetrical scales with nonoverlapping categories23,24. The final 7-point Likert scales went from ‘extremely unreliable’ to ‘extremely reliable’, from ‘extremely difficult to understand’ to ‘extremely easy to understand’ and from ‘extremely unempathic’ to ‘extremely empathic’.
For the ‘AI’-label group, ratings for each scale were positively correlated with participants’ attitudes toward AI (perceived opportunities compared with risks, perceived impact for healthcare), Ps ≤ 0.022, thus pointing to high conceptual validity of our scales.
Experimental design and procedure
We used a unifactorial between-subject design, with the manipulated factor being the supposed author of the presented medical information (human, AI, human + AI; Supplementary Fig. 5). Participants were instructed to carefully read all scenarios, which were presented in random order. Afterward, we assessed participants’ attitudes toward AI. Specifically, we asked about their frequency of using AI-based tools (response options: never, rarely, occasionally, frequently, very frequently), their perception of the impact of AI on healthcare (response options: no, minor, moderate, significant, highly significant) and whether they view the integration of AI in healthcare as presenting more risks or opportunities (response options: more risks, neutral, more opportunities). Finally, we collected demographic information on gender, age, educational level and nationality.
Data treatment and analyses
We preregistered our analysis plan, data collection strategy and the experimental design (https://osf.io/6trux).
Data analysis was conducted in R version 4.1.1 (R Core Team). A separate analysis of variance was calculated for each rating dimension (reliability, comprehensibility, empathy), using the supposed author of the medical advice as a between-subject factor (human, AI, human + AI). Significant main effects were followed by two-sample t-tests (two-tailed), comparing all factor levels. Cohen’s d is reported as a measure of effect size, which is calculated with the t_out function of the schoRsch package version 1.10 in R (ref. 25). To account for multiple testing, we used the Holm–Bonferroni method to adjust the significance level (α).
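To illustrate this pipeline, the sketch below runs the analysis for one rating dimension. The data frame dat, its columns (empathy, label) and the condition codings are placeholders for illustration, not the original variable names:

# Sketch of the preregistered analysis for one rating dimension (here: empathy).
library(schoRsch)  # provides t_out() for formatted t-test output incl. Cohen's d

summary(aov(empathy ~ label, data = dat))  # one-way between-subject ANOVA

# Two-tailed two-sample t-tests between all author labels (pooled variance,
# matching the reported degrees of freedom of 698)
tests <- list(
  human_vs_ai       = t.test(empathy ~ label, var.equal = TRUE,
                             data = droplevels(subset(dat, label != "human + AI"))),
  human_vs_human_ai = t.test(empathy ~ label, var.equal = TRUE,
                             data = droplevels(subset(dat, label != "AI"))),
  ai_vs_human_ai    = t.test(empathy ~ label, var.equal = TRUE,
                             data = droplevels(subset(dat, label != "human")))
)
lapply(tests, t_out)  # t, df, P and Cohen's d per comparison

# Holm-Bonferroni correction across the three pairwise comparisons
p.adjust(sapply(tests, function(x) x$p.value), method = "holm")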
As an additional analysis, which we did not preregister, a separate mixed-effect regression analysis was calculated for each rating dimension (reliability, comprehensibility, empathy), using the supposed author of the medical advice (human, AI, human + AI) as a fixed factor and the different scenarios as well as the individual participant as random factors (intercepts). The author label condition was dummy coded with the ‘human’ condition as the reference category. We report absolute values for all statistics and P values were calculated using Satterthwaite’s method. Corresponding results are reported in Supplementary Information.
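A minimal sketch of this model specification is given below. It assumes the lmerTest package (a common way to obtain Satterthwaite-based P values for lmer models) and an illustrative long-format data frame dat_long with one row per participant and scenario:

# Sketch of the (non-preregistered) mixed-effects model for one rating dimension.
library(lmerTest)  # lmer() with Satterthwaite-approximated degrees of freedom

dat_long$label <- relevel(factor(dat_long$label), ref = "human")  # 'human' as reference category

m <- lmer(rating ~ label + (1 | scenario) + (1 | participant), data = dat_long)
summary(m)  # fixed-effect contrasts: 'AI' and 'human + AI' vs 'human'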
Study 2
Participants
For study 2, we recruited a new sample of 1,456 participants via Prolific, among which 6.1% (n = 89) did not finish the experiment and were thus excluded from the analysis. As preregistered, we further excluded datasets of participants who failed the attention check (that is, indicated the wrong author label at the end of the study; see ‘Materials and procedure’ for details). This applied to 9.4% (n = 137) of our participants. Thus, our final sample consisted of 1,230 individuals (410 per author label group). For our second study, we exclusively recruited participants from the United Kingdom and our sample was representative of the UK population in terms of age, gender and ethnicity (self-reported gender identity: 595 males, 619 females, 10 non-binaries, 6 prefer not to say; age: M = 47.3 years, s.d. = 15.6 years). Our sample size provided high statistical power to detect even small effects of the author label on reported ratings (1 − β = 90% for d ≥ 0.270, α = 0.01, two-sample t-test, two-tailed testing, computed in R, version 4.1.1, via the power.t.test function of the stats package). The majority of this sample indicated a university degree as their highest level of education (12 no formal qualification, 146 secondary education, 325 high school, 532 bachelor, 167 master, 40 PhD, 8 prefer not to say).
Materials and procedure
In our second experiment, we used the same case reports as in study 1. Again, we used a unifactorial between-subject design, with the manipulated factor being the supposed author of the presented medical information (human, AI, human + AI; Supplementary Fig. 5). However, in contrast to study 1, the author label was manipulated only via text instead of via additional symbols. The experimental procedure was similar to that of study 1, but we used two additional measures of preference. Thus, in addition to perceived reliability, comprehensibility and empathy, we also measured the individual willingness to follow the provided advice. To further test the robustness of our survey instruments, we also slightly adapted the scales on which participants rated the respective dimensions. That is, we used 5-point Likert scales (instead of the 7-point scales used in study 1), going from ‘very unreliable’ to ‘very reliable’, from ‘very difficult to understand’ to ‘very easy to understand’, from ‘very unempathic’ to ‘very empathic’ and from ‘very unwilling’ to ‘very willing’. Moreover, at the end of the experiment, participants had the opportunity to save a (fictitious) link to the platform and tool that supposedly generated the previously encountered responses. This tool was framed depending on the experimental condition (‘The previous scenarios were exemplary conversations from a digital platform where users can engage in conversations with a licensed medical doctor (an AI-supported chatbot) regarding medical queries. (All responses on this platform are reviewed by a licensed medical doctor and may be supplemented or revised if necessary.)’). Participants could save this link by clicking on a corresponding button. For each rating dimension, there was a positive relation with the decision to save the link, Ps ≤ 0.012. Moreover, similar to study 1, for the AI condition, attitudes toward AI (perceived opportunities and impact) were positively correlated with ratings in each domain, Ps ≤ 0.001, thus again supporting the validity of our scales. At the end of the study, we again queried participants’ attitudes toward AI and demographic information. In addition, we also assessed participants’ patient status (‘Based on your current health status, would you describe yourself as a patient?’; response options: yes, no, prefer not to say) and whether they work in a healthcare-related profession or had received healthcare-related training (‘Based on your training or current profession, would you describe yourself as a healthcare professional?’; response options: yes, no, prefer not to say). If the latter question was answered with ‘yes’, participants could also indicate their exact profession. Finally, as an attention check, we asked participants who the stated source of the provided medical responses was (‘a licensed medical doctor’, ‘an AI-supported chatbot’, ‘an AI-supported chatbot, revised and supplemented by a licensed medical doctor’).
Data treatment and analyses
We preregistered our analysis plan, data collection strategy and the experimental design (https://osf.io/wn6mj).
Again, data analysis was conducted in R version 4.1.1 (R Core Team). For each rating dimension (reliability, comprehensibility, empathy, willingness to follow), a similar mixed-effect regression analysis was calculated as for study 1. Significant treatment effects were followed by two-sample t-tests (two-tailed), comparing all factor levels. Similar to study 1, Cohen’s d is reported as a measure of effect size. Furthermore, we calculated a binomial logistic regression of the decision to press the ‘save link’ button (yes or no), using the author label condition (human, AI, human + AI) as a fixed factor and the individual participant as a random factor (intercept). The author label condition was dummy coded with the ‘human’ condition as the reference category. We report absolute values for all statistics and P values were calculated using Satterthwaite’s method. Again, the Holm–Bonferroni method was applied to account for multiple testing.
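For the link-saving outcome, a hedged sketch of the described logistic model follows. The glmer function from the lme4 package is assumed here, and the data frame dat2 with columns saved, label and participant is illustrative:

# Sketch of the binomial logistic regression of the 'save link' decision.
library(lme4)

dat2$label <- relevel(factor(dat2$label), ref = "human")  # dummy coding, 'human' as reference

m_link <- glmer(saved ~ label + (1 | participant),
                family = binomial, data = dat2)
summary(m_link)  # Wald z tests for 'AI' and 'human + AI' vs 'human'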
As an exploratory analysis, we correlated individual attitudes toward AI (usage frequency, perceived risk, perceived impact) and further individual characteristics (age, gender, level of education, patient status, healthcare-related profession or training) with ratings of reliability, comprehensibility, empathy, willingness to follow and the decision to save the link to the fictitious platform. These calculations were conducted separately for the ‘AI’ and the ‘human + AI’ group. Results for all exploratory analyses are reported in Supplementary Information.
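One such exploratory correlation might be computed as follows; the variable names and the choice of a rank correlation for the ordinal attitude item are illustrative assumptions rather than the original specification:

# Illustrative correlation within the 'AI' group: attitude toward AI
# (risks vs opportunities item) and reliability ratings.
ai_group <- subset(dat2, label == "AI")
cor.test(as.numeric(ai_group$ai_risk_opportunity),
         as.numeric(ai_group$reliability),
         method = "spearman")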
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Underlying data for both studies can be found via OSF at https://osf.io/cxb7s/.
Code availability
Analysis code for both studies can be found via OSF at https://osf.io/cxb7s/.
References
Ker, J., Wang, L., Rao, J. & Lim, T. Deep learning applications in medical image analysis. IEEE Access 6, 9375–9389 (2018).
Han, K. et al. A review of approaches for predicting drug–drug interactions based on machine learning. Front. Pharmacol. 12, 814858 (2022).
Nori, H., King, N., McKinney, S. M., Carignan, D. & Horvitz, E. Capabilities of GPT-4 on medical challenge problems. Preprint at https://arxiv.org/abs/2303.13375 (2023).
Li, J. Security implications of AI chatbots in health care. J. Med. Internet Res. 25, e47551 (2023).
Hirosawa, T. et al. ChatGPT-generated differential diagnosis lists for complex case-derived clinical vignettes: diagnostic accuracy evaluation. JMIR Med. Inform. 11, e48808 (2023).
Ayers, J. W. et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern. Med. 183, 589–596 (2023).
Dietvorst, B. J., Simmons, J. P. & Massey, C. Algorithm aversion: people erroneously avoid algorithms after seeing them err. J. Exp. Psychol. Gen. 144, 114–126 (2015).
Young, A. T., Amara, D., Bhattacharya, A. & Wei, M. L. Patient and general public attitudes towards clinical artificial intelligence: a mixed methods systematic review. Lancet Digit. Health 3, e599–e611 (2021).
Choudhury, A., Elkefi, S. & Tounsi, A. Exploring factors influencing user perspective of ChatGPT as a technology that assists in healthcare decision making: a cross sectional survey study. PLoS ONE 19, e0296151 (2024).
Shahsavar, Y. & Choudhury, A. User intentions to use ChatGPT for self-diagnosis and health-related purposes: cross-sectional survey study. JMIR Hum. Factors 10, e47564 (2023).
Gaube, S. et al. Do as AI say: susceptibility in deployment of clinical decision-aids. NPJ Digit. Med. 4, 31 (2021).
Birkhäuer, J. et al. Trust in the health care professional and health outcome: a meta-analysis. PLoS ONE 12, e0170988 (2017).
Shuaib, A., Arian, H. & Shuaib, A. The increasing role of artificial intelligence in health care: will robots replace doctors in the future? Int. J. Gen. Med. 13, 891–896 (2020).
Lu, X., Zhang, R., Wu, W., Shang, X. & Liu, M. Relationship between internet health information and patient compliance based on trust: empirical study. J. Med. Internet Res. 20, e253 (2018).
Formosa, P., Rogers, W., Griep, Y., Bankins, S. & Richards, D. Medical AI and human dignity: contrasting perceptions of human and artificially intelligent (AI) decision making in diagnostic and medical resource allocation contexts. Comput. Hum. Behav. 133, 107296 (2022).
Longoni, C., Bonezzi, A. & Morewedge, C. K. Resistance to medical artificial intelligence. J. Consum. Res. 46, 629–650 (2019).
Thorndike, E. L. A constant error in psychological ratings. J. Appl. Psychol. 4, 25–29 (1920).
Esmaeilzadeh, P., Mirzaei, T. & Dharanikota, S. Patients’ perceptions toward human–artificial intelligence interaction in health care: experimental study. J. Med. Internet Res. 23, e25856 (2021).
Aoki, N. The importance of the assurance that ‘humans are still in the decision loop’ for public trust in artificial intelligence: evidence from an online experiment. Comput. Hum. Behav. 114, 106572 (2021).
Henninger, F., Shevchenko, Y., Mertens, U. K., Kieslich, P. J. & Hilbig, B. E. lab.js: a free, open, online study builder. Behav. Res. Methods 54, 556–573 (2022).
Regula, C. G., Miller, J. J., Mauger, D. T. & Marks, J. G. Quality of care from a patient’s perspective. Arch. Dermatol. 143, 1592–1593 (2007).
Kremers, M. N. T. et al. Patient’s perspective on improving the quality of acute medical care: determining patient reported outcomes. BMJ Open Qual. 8, e000736 (2019).
Khadka, J., Gothwal, V. K., McAlinden, C., Lamoureux, E. L. & Pesudovs, K. The importance of rating scales in measuring patient-reported outcomes. Health Qual. Life Outcomes 10, 80 (2012).
Garratt, A. M., Helgeland, J. & Gulbrandsen, P. Five-point scales outperform 10-point scales in a randomized comparison of item scaling for the Patient Experiences Questionnaire. J. Clin. Epidemiol. 64, 200–207 (2011).
Pfister, R. & Janczyk, M. schoRsch: an R package for analyzing and reporting factorial experiments. Quant. Method Psych. 12, 147–151 (2016).
Acknowledgements
We thank V. Mocke for helpful comments on the study design. This work was supported by the Faculty of Humanities of the University of Würzburg. M.R. is supported by a PhD scholarship from the German Academic Scholarship Foundation. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the paper.
Funding
Open access funding provided by Julius-Maximilians-Universität Würzburg.
Author information
Authors and Affiliations
Contributions
M.R. conceived the study; did investigations; provided methodology, software, validation, formal analysis and data curation; wrote the original draft; reviewed and edited the paper; provided visualization and project administration; and acquired funds. F.R. conceived the study, provided methodology, wrote the original draft, reviewed and edited the paper and provided project administration. W.K. conceived the study, provided methodology, reviewed and edited the paper, provided resources, supervised the study and acquired funds.
Corresponding author
Ethics declarations
Competing interests
F.R. is a current employee of Pfizer Pharma GmbH in Berlin, Germany. Pfizer had no substantive or financial involvement in the conception, implementation or analysis of this study, nor in the creation or publication of the associated paper. The other authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks Avishek Choudhury, Nils Köbis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lorenzo Righetto, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–5 and Results.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Reis, M., Reis, F. & Kunde, W. Influence of believed AI involvement on the perception of digital medical advice. Nat Med (2024). https://doi.org/10.1038/s41591-024-03180-7