Abstract
This study explored the application of generative pre-trained transformer (GPT) agents based on medical guidelines, built with large language model (LLM) technology, for traumatic brain injury (TBI) rehabilitation-related questions. To assess the effectiveness of multiple agents (GPT-agents) created using GPT-4, a comparison was conducted using direct GPT-4 as the control group (GPT-4). The GPT-agents comprised multiple agents with distinct functions, including “Medical Guideline Classification”, “Question Retrieval”, “Matching Evaluation”, “Intelligent Question Answering (QA)”, and “Results Evaluation and Source Citation”. Brain rehabilitation questions were selected from a doctor-patient Q&A database for assessment. The primary endpoint was a better answer. The secondary endpoints were accuracy, completeness, explainability, and empathy. Thirty questions were answered; overall, GPT-agents took substantially longer and used more words to respond than GPT-4 (time: 54.05 vs. 9.66 s; words: 371 vs. 57). However, GPT-agents provided superior answers in more cases than GPT-4 (66.7 vs. 33.3%). GPT-agents surpassed GPT-4 in the accuracy evaluation (3.8 ± 1.02 vs. 3.2 ± 0.96, p = 0.0234). No difference in incomplete answers was found (2 ± 0.87 vs. 1.7 ± 0.79, p = 0.213). However, in the explainability (2.79 ± 0.45 vs. 1.07 ± 0.52, p < 0.001) and empathy (2.63 ± 0.57 vs. 1.08 ± 0.51, p < 0.001) evaluations, the GPT-agents performed notably better. Based on medical guidelines, GPT-agents enhanced the accuracy and empathy of responses to TBI rehabilitation questions. This study provides guideline references and demonstrates improved clinical explainability. However, further validation through multicenter trials in a clinical setting is necessary. This study offers practical insights and establishes groundwork for the potential theoretical integration of LLM-agents in medicine.
Introduction
Based on data provided by the World Health Organization, traumatic brain injury (TBI) is the third leading cause of death globally1, accounting for nearly half of all injury-related deaths worldwide1,2. Moreover, TBI is a major cause of acquired disability worldwide; however, effective treatment methods are scarce3. Brain trauma can lead to head injuries, skull fractures, brain tissue damage, and, in severe cases, coma, memory loss, and cognitive impairment. Owing to the limited regenerative capacity of the nervous system, the rehabilitation of patients with brain trauma is a lengthy process3.
Recently, the use of artificial intelligence (AI) to provide personalized medical services for clinical brain rehabilitation has gained significant attention4. AI offers the advantage of providing prompt diagnostic and therapeutic recommendations for brain rehabilitation. An emerging area of research is the use of large language models (LLMs) as a tool for rehabilitation support, which has gained traction in a variety of fields, including chronic pulmonary disease5, rehabilitation education6, and physical and skeletal rehabilitation7,8. Despite advancements in LLMs, this technology has limitations, including issues with accuracy and comprehensiveness9,10. LLMs may also generate “hallucinations”11,12, making them unsuitable for providing professional medical advice. Moreover, the lack of explainability13,14 of the output results makes it difficult for doctors and patients to establish trust when interacting with a “robotic system”.
In the field of GPT technology, the use of agents is considered the latest approach for tackling complex problems15,16. This approach has demonstrated exceptional performance in fields such as programming17, gaming18, and even complex computer tasks19. However, the application of such agents in the medical field remains in the nascent stage. This study therefore aimed to explore the use of agent technology based on medical guidelines that can respond to user inputs. Simultaneously, relevant content from the medical guidelines was output within the responses to enhance the explainability of the results.
This study comprised a comparative analysis of the responses between direct GPT-4 and GPT-agents (constructed based on guidelines). A set of brain rehabilitation questions was selected from the doctor-patient Q&A database for assessments. The primary endpoint was a better answer, whereas the secondary endpoints included accuracy, completeness, explainability, and empathy.
Results
ChatGPT-agents question-answering system
Thirty random questions (Supplementary Table 2) were answered, and it was observed that GPT-agents took significantly longer to respond than GPT-4 (54.05 vs. 9.66 s per question). The “Results Evaluation and Source Citation” agent had the longest response time (Table 1, Fig. 1). Regarding word count, GPT-4 answered in an average of 57 words, which was significantly fewer than the average of 371 words for GPT-agents (Fig. 2).
Evaluation results
Three evaluators assessed the responses to 30 random questions (Supplementary Table 2). Based on the evaluation results, GPT-agents were found to have provided superior answers in most cases (n = 20, 66.7%) compared to GPT-4 (n = 10, 33.3%). Chi-square analysis revealed that GPT-agents significantly outperformed the GPT-4 group (χ2 = 6.667, p = 0.0098). Further analysis of the accuracy evaluation revealed that the guideline-based GPT-agents (3.8 ± 1.02) outperformed GPT-4 (3.2 ± 0.96, p = 0.0234). The completeness evaluation showed that both models gave incomplete answers, with no significant difference (2 ± 0.87 vs. 1.7 ± 0.79, p = 0.213). However, in terms of the explainability and empathy evaluations, the GPT-agents performed significantly better than GPT-4 (Table 2, Fig. 3).
In the response analysis, when faced with information not covered in the guidelines, GPT-agents explicitly indicated “unclear” instead of fabricating content that conflicted with the guidelines (Supplementary Table 2). In the evaluation section of the results, the GPT-agents explicitly indicated whether the answers were correct and cited the specific content from the guidelines (Table 3).
Discussion
In this study, medical guidelines20 and agents based on GPT-4 were used to answer questions related to TBI rehabilitation. This system automatically evaluates the correctness of the answers, simultaneously providing relevant content from the medical guidelines to enhance explainability. The evaluation revealed that the responses generated by the guideline-based GPT-agents performed better in terms of accuracy, explainability, and empathy than those obtained by directly querying GPT-4.
Brain rehabilitation is a comprehensive and lengthy treatment process involving a variety of aspects including physical therapy, speech therapy, cognitive training, and psychological support21,22. The LLM acquires knowledge from various professional disciplines during training, making it highly suitable for assisting with brain rehabilitation.
Currently, LLMs have demonstrated potential in the medical field23,24, owing to their powerful natural language processing and generation capabilities25,26. However, the direct use of LLMs is still limited by certain challenges, such as inaccurate responses or the generation of hallucinations. Agents based on LLMs have shown significant advantages in complex task processing. For example, humans typically perform autonomous programming or automate certain real-world tasks using computers or smartphones. Agents can also be employed in medical tasks, such as dermatological patient-doctor conversations and assessments27. The GPT-agents constructed in this study involved multiple API calls, which produced lengthier answers but increased the response time. Overall, it was found that the GPT-agents had an extended response time compared to GPT-4, but could still provide answers within an average range of 1–2 min, generating an output with a word count between 300 and 700 words (in Chinese). This speed is acceptable for clinical counseling, as it is much shorter than the real-world waiting time in hospitals for treatment.
Traditional direct question-answering systems such as ChatGPT have been found to be limited by potential issues related to accuracy28,29 and the generation of hallucinatory responses for medical queries30,31. Medical guidelines and expert consensus thus serve as the cornerstone of clinical practice. GPT-4 has powerful summarization capabilities29, making it a potential tool for guideline classification. In the present study, we observed that after inputting guideline information into GPT-4, its medical role was significantly activated, leading to improved response accuracy. We further found that the inclusion of guidelines did not directly restrict the agents’ responses. Overall, our GPT-agents could provide suggestions during result evaluation, which offers an alternative when no answer is available based on the guidelines.
Several studies have previously attempted to improve the accuracy and completeness of LLM by including prompt engineering, fine-tuning, and retraining29,32. Considering the high cost of fine-tuning and retraining, this study focused instead on prompt engineering techniques. By utilizing guideline-based agents to process the guidelines and input them as prompts to the GPT, the accuracy of the agents’ responses improved significantly. This improvement could be attributed to the prompt use of medical guidelines, which better set the context and cultural positioning of GPT. Guidelines are commonly modified to suit the specific healthcare environments in a particular region. Thus, different healthcare environments and conditions may implement slightly different approaches for the same medical issue. For example, Traditional Chinese Medicine is often incorporated into medical guidelines and consensus in China20. This study followed a logical chain of thinking, incorporating knowledge from medical guidelines, and employed multiple evaluative agents to assess the questions and answers. We believe that providing professional medical guidelines and utilizing evaluative agents are superior strategies for enhancing response quality.
Completeness reflects the accumulation of experience in long-term clinical work, involving insights and reflections on multiple dimensions of illness. In the present study, we found that both GPT-agents and GPT-4 were lacking in terms of completeness, indicating that their ability to answer medical questions is still in the early stages of development. Further research should explore whether combining fine-tuning techniques can improve completeness.
Explainability is an important criterion when evaluating the current use of AI in medicine14,33. Because of their large number of parameters, LLMs are inherently difficult to explain. In the present study, the explainability of the results was assessed by referencing the original text of the guidelines. After the answer was evaluated as “correct” or “incorrect”, the related original text of the referenced guideline content was output by the final agent. This significantly increases the explanatory power of the results.
Patients with brain injury often require a lengthy recovery period, and rely on their families for reintegration into society. Empathy can help family members to understand and motivate patients, thus boosting their confidence in treatment. The GPT-4 itself seems to have an advantage over clinical doctors in terms of empathy34,35. In the present study, we found that GPT-agents had significantly enhanced empathy compared to the base GPT-4. This may be attributed to the inclusion of more medical information, which provided the GPT with more precise positioning and allowed it to generate words associated with empathy.
Although this study found that GPT-agents based on medical guidelines could significantly improve medical responses, there are still some limitations which should be considered. First, the use of GPT-agents results in an increased time cost. Overall, we found an average increase of 1 min in response time for GPT-agents in our study. However, this may be affected by different areas and Internet environments. Second, there is the issue of incomplete answers. Clinical practice is complex and involves multiple disciplines. However, no single guideline can adequately address these complex clinical issues. Guidelines are constantly evolving, and may not always align with the most advanced treatment approaches. As such, these guidelines must be critically evaluated. Incorporating a wide and non-duplicative summary of guidelines can help to overcome this problem. Third, this study did not employ random double-blinding owing to the inclusion of guideline references in the GPT-agents’ responses, making it impossible to implement blinding of assessors, which could have led to subjectivity in the results. Finally, the actual medical environments in hospitals are complex and variable, involving individual patient situations, medical histories, and symptoms. Additionally, ethical and medical regulations differ across regions. ChatGPT may not have fully considered these factors when answering questions, thus limiting the applicability of its responses. As such, when using the GPT, healthcare professionals and clinical teams must maintain professional judgment, integrate GPT responses with specific patient contexts, and develop the best diagnosis and treatment plans accordingly.
In future research, optimization could be continued through several approaches. First, it will be necessary to further refine the foundational large models, particularly by upgrading them to multimodal models. This is crucial, as many patients with clinical brain injury may not be able to complete typing or speaking tasks. Utilizing various input modes (such as voice and images) can help to broaden accessibility. Second, further studies should explore whether agents based on medical guidelines exhibit common patterns in other conditions, such as rare diseases or critical illnesses. It is essential to determine whether employing guideline-based agents can enhance the responses of LLMs. Finally, as various diseases and medical guidelines intersect, research on recommendation algorithms will be necessary. This algorithm should accurately assess and rank diverse search contents, discerning patients' true intentions, as different diseases involve varying guidelines, and a single condition may have multiple treatment guidelines.
Despite these limitations, our research showed that GPT-agents that rely on medical guidelines hold significant promise for various medical applications. By integrating evidence-based guidelines, these agents can utilize the wealth of knowledge and expertise accumulated through extensive clinical practice and research. This integration not only improves the reliability of the generated responses, but also ensures their alignment with established medical standards and best practices.
Overall, the results of this study showed that GPT-agents have enhanced the accuracy and empathy of responses to TBI rehabilitation questions. This study provides guideline references and demonstrates improved clinical explainability. Compared to the direct use of GPT-4, GPT-agents based on medical guidelines showed improved performance, despite the slight increase in response time. With advances in technology, this delay is expected to be minimized. However, further validation through multicenter trials in a clinical setting is necessary. Overall, this study offers practical insights and establishes the groundwork for the potential theoretical integration of LLM-agents in the field of medicine.
Methods
This study employed a cross-sectional, non-human subject research design. A flowchart of the study design is shown in Fig. 1. As this study did not involve human or animal participants, and the ChatGPT/OpenAI API could be freely accessed via Kaggle.com, Ethical Committee approval was not required.
Several LLMs are currently available; online models include Google's Bard, Microsoft's Bing, Baidu's Wenxin Yiyan, IFLYTEK's Spark, and OpenAI's GPT series, among others. Offline deployable options include LLaMA and ChatGLM. Given the popularity of GPT-4 among our research team, GPT-4 was chosen as the foundational model.
In the present study, multiple agents were constructed using GPT-4, including “Medical Guideline Classification”, “Question Retrieval”, “Matching Evaluation”, “Intelligent Question-Answering”, and “Results Evaluation and Source Citation” (Fig. 1). The knowledge for the agents was derived from expert consensus or guidelines on brain injury rehabilitation from China.
Design of guideline-based ChatGPT-agents (GPT-agents)
Guideline-based GPT-agents were designed based on GPT-4. The primary objective of an intelligent agent is to retrieve and provide word suggestions as answers. An evaluation was introduced for each of the steps mentioned above, resulting in five intelligent agents (Table 1). The first agent was responsible for the clustering analysis of the guidelines, extracting the topics and subtopics of each section, and then saving all of these extracted topics for later reference and retrieval. The second agent searched the inputted question within the subtopics, and the output was the question plus the related content of the medical guideline from the first agent. The third agent performed a “Matching Evaluation” to check whether the question and the content were relevant. The fourth agent was a question-answering agent, which synchronously input the user's question and corresponding topic-related content into the GPT-4 model to generate the answer to the question. Finally, the fifth agent performed two functions: first, it evaluated the accuracy of the generated answer by comparing it with the contents of the guidelines; second, it produced the final response along with the relevant guideline content corresponding to this response (Fig. 1A).
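The chained agents described above can be sketched as follows. This is a minimal illustration, not the authors' code: the prompts, the `[MATCH]`/`[CHECK]`/`[ANSWER]`/`[EVALUATE]` labels, and the generic `call_llm` callable (which in practice would wrap the GPT-4 chat API) are all assumptions.

```python
def answer_with_agents(question, sections, call_llm):
    """Chain agents 2-5; `sections` maps subtopic -> guideline excerpt.
    Agent 1 (guideline classification) is assumed to have built
    `sections` offline, so it does not run per question."""
    # Agent 2: retrieve the best-matching guideline subtopic.
    topic = call_llm(
        "[MATCH] Choose the best subtopic for the question.\n"
        f"Subtopics: {sorted(sections)}\nQuestion: {question}").strip()
    # Agent 3: matching evaluation (True/False relevance check).
    relevant = call_llm(
        f"[CHECK] Reply True or False: is '{topic}' relevant to "
        f"'{question}'?").strip() == "True"
    context = sections.get(topic, "") if relevant else ""
    # Agent 4: question answering grounded in the retrieved excerpt.
    answer = call_llm(
        f"[ANSWER] Guideline excerpt:\n{context}\nQuestion: {question}\n"
        "Reply 'unclear' if the excerpt does not cover the question.").strip()
    # Agent 5: result evaluation plus source citation.
    verdict = call_llm(
        "[EVALUATE] Does the answer agree with the excerpt?\n"
        f"Excerpt: {context}\nAnswer: {answer}").strip()
    return {"answer": answer, "verdict": verdict, "source": context}

# Demo with a canned stand-in for the GPT-4 API call:
SECTIONS = {"early mobilization":
            "Begin bedside rehabilitation once vital signs are stable."}

def stub(prompt):
    if prompt.startswith("[MATCH]"):
        return "early mobilization"
    if prompt.startswith("[CHECK]"):
        return "True"
    if prompt.startswith("[ANSWER]"):
        return "Start gentle bedside exercises once the patient is stable."
    return "correct: consistent with the cited excerpt"

out = answer_with_agents("When can rehabilitation start after TBI?",
                         SECTIONS, stub)
print(out["verdict"])
```

Injecting `call_llm` as a parameter keeps the orchestration testable offline; swapping in a real GPT-4 wrapper reproduces the study's online behavior.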
The program was deployed on the Kaggle platform (Kaggle.com), and OpenAI’s GPT-4 API was utilized for automated question answering. The program automatically recorded the number of words generated as well as the time consumed. The first agent was responsible for categorization, which only ran once and did not participate in the answer-generation process; therefore, time and word counts were not recorded for this agent. For the second and third agents, as their results mainly involved returning potential content from the guidelines and “True/False” answers, word counts were also not recorded.
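A per-call logging helper of the kind described might look like the sketch below. The helper name and shape are assumptions; note also that whitespace splitting approximates word counts for English, whereas the study's Chinese output would need segmentation or a character count.

```python
import time

def timed_call(agent_fn, *args):
    """Run one agent call and record elapsed seconds plus output length.
    Whitespace splitting approximates a word count for English text;
    Chinese text would need a segmenter or character count instead."""
    start = time.perf_counter()
    text = agent_fn(*args)
    elapsed = time.perf_counter() - start
    return text, elapsed, len(text.split())

# Demo with a trivial stand-in agent:
text, seconds, words = timed_call(lambda q: "an example answer", "q")
print(words)
```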
The direct GPT (GPT-4)
The direct question-and-answer design was based on GPT-4, utilizing the same environment as the GPT-agents. Within this design, all questions were posed within a "for" loop (as in the GPT-agents group), and GPT-4 directly generated responses (Fig. 1B). The process recorded all the content, including the time consumed and the word count of the generated answers.
The medical guidelines
The references for TBI rehabilitation guidelines were obtained by searching a specialized Chinese database that collects all clinical guidelines and expert consensuses (Clinical Guidelines Network, https://guide.medlive.cn). Brain rehabilitation guidelines and standards were retrieved and thoroughly reviewed by a clinician (L.Z.Z.) with 14 years of clinical work experience. After clinical evaluation, the expert consensus20 that best aligned with Chinese TBI rehabilitation practice was incorporated into the system to make it more comprehensive and inclusive of content from traditional Chinese medicine.
Question data collection
First, 300 real-world brain rehabilitation-related questions from doctor-patient interactions were collected from online sources. Two medical experts (L.Z.Z and Z.W), both with over 10 years of clinical experience, who worked at the same Grade A tertiary hospital, manually collected 300 Chinese brain injury rehabilitation-related questions from two open-source Chinese medical dialogue datasets (https://github.com/Toyhom/Chinese-medical-dialogue-data, datasets/ FreedomIntelligence/huatuo_knowledge_graph_qa) and one website (https://youlai.cn/). Each question was accompanied by an answer, and the responses to these questions are publicly available. These questions cover the various stages of brain injury rehabilitation. Second, we randomly selected 30 questions to ask and evaluate using a computer method (e.g., `random.sample(list, 30)` in Python).
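The random selection step can be reconstructed as follows. Note that Python's `random.choice` returns a single element, so drawing 30 distinct questions calls for `random.sample`; the seed and the placeholder question IDs below are illustrative, as the paper does not report them.

```python
import random

random.seed(42)  # illustrative seed only; the study does not report one
questions = [f"Q{i:03d}" for i in range(1, 301)]  # placeholders for the 300 collected questions
# Sample 30 questions without replacement:
subset = random.sample(questions, 30)
print(len(subset))  # → 30
```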
The inclusion criteria were as follows: (1) questions related to brain rehabilitation; (2) answers by medical experts available; (3) publicly available question-and-answer pairs without involving personal privacy; and (4) no copyright restrictions. The exclusion criteria were as follows: (1) inadequate responses prompting further hospital visits; (2) questions focusing on severe complications in vital organs such as the heart or kidneys; (3) unanswered questions by doctors; and 4) questions violating medical ethics or Chinese laws in questions or answers.
Evaluation for GPT-agents and GPT-4
The evaluation team members included a chief physician (Z. J. F.), a senior physician (L. Z. Z.), and a nurse (X. R. Y.), all of whom had more than 10 years’ experience in clinical practice. The primary endpoint was a better answer, whereas the secondary endpoints included accuracy, completeness, explainability, and empathy.
First, the evaluators judged which of the two answers (GPT-4 or GPT-agents) was better. Next, we evaluated the four sub-dimensions of accuracy, completeness, explainability, and empathy separately.
We developed a Likert scoring scale to evaluate the responses. For accuracy, we referenced previous studies36 and adopted a continuous 5–0 rating system. The other dimensions were evaluated using a continuous 3–0 scale. A higher score signified strong agreement, whereas a score of 0 indicated strong disagreement (Supplementary Table 1).
Statistical analysis
Categorical data of the primary endpoint are presented as the number of cases and their respective rates. Comparisons between groups were performed using the chi-square or Fisher’s exact tests. Other measurement data with a normal distribution are presented as means ± standard deviations, and comparisons between groups were conducted using two independent-sample t-tests. Measurement data with a skewed distribution are presented as medians and interquartile ranges. The level of statistical significance was set at p < 0.05. All statistical analyses were performed using GraphPad software (version 8). The time consumed and word count were displayed using Matplotlib in Python 3.10.
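The reported primary-endpoint statistic (χ² = 6.667, p = 0.0098 for 20 vs. 10 better answers) is reproduced by one plausible construction: a 2 × 2 table pairing each group's wins and losses, [[20, 10], [10, 20]]. Whether this is exactly the table the authors used is an assumption (its rows are complements of one another, so the 30 paired judgments are counted twice). A pure-Python check, using the df = 1 chi-square survival function erfc(√(χ²/2)):

```python
import math

# 2x2 contingency table: rows = model, cols = (better, not better)
obs = [[20, 10], [10, 20]]
row = [sum(r) for r in obs]
col = [sum(c) for c in zip(*obs)]
total = sum(row)
# Pearson chi-square without continuity correction:
chi2 = sum((obs[i][j] - row[i] * col[j] / total) ** 2
           / (row[i] * col[j] / total)
           for i in range(2) for j in range(2))
# Survival function of the chi-square distribution with df = 1:
p = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 3), round(p, 4))  # → 6.667 0.0098
```

The same values follow from `scipy.stats.chi2_contingency(obs, correction=False)`.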
Data availability
The original data presented in the study are included in the article/supplementary material.
References
Posti, J. P., Kytö, V., Sipilä, J. O. T., Rautava, P. & Luoto, T. M. High-risk periods for adult traumatic brain injuries: A nationwide population-based study. Neuroepidemiology 55, 216–223 (2021).
Capizzi, A., Woo, J. & Verduzco-Gutierrez, M. Traumatic brain injury. Med. Clin. N. Am. 104, 213–238 (2020).
Marklund, N. et al. Treatments and rehabilitation in the acute and chronic state of traumatic brain injury. J. Intern. Med. 285, 608–623 (2019).
Guo, Y. et al. Artificial intelligence-assisted repair of peripheral nerve injury: A new research hotspot and associated challenges. Neural Regen. Res. 19, 663–670 (2024).
Hasnain, M., Hayat, A. & Hussain, A. Revolutionizing chronic obstructive pulmonary disease care with the open AI application: ChatGPT. Ann. Biomed. Eng. 51, 2100–2102 (2023).
Peng, S. et al. AI-ChatGPT/GPT-4: An booster for the development of physical medicine and rehabilitation in the New Era!. Ann. Biomed. Eng. 52, 462–466 (2023).
McBee, J.C., Han, D.Y., Liu, L., et al. Interdisciplinary inquiry via PanelGPT: Application to explore chatbot application in sports rehabilitation. medRxiv (2023).
Rossettini, G., Cook, C., Palese, A., Pillastrini, P. & Turolla, A. Pros and cons of using artificial intelligence Chatbots for musculoskeletal rehabilitation management. J. Orthop. Sport Phys. 53, 728–734 (2023).
He, Y. et al. Will ChatGPT/GPT-4 be a lighthouse to guide spinal surgeons?. Ann. Biomed. Eng. 51, 1362–1365 (2023).
Kuang, Y. et al. ChatGPT encounters multiple opportunities and challenges in neurosurgery. Int. J. Surg. 109, 2886–2891 (2023).
Perera Molligoda Arachchige, A. S. Large language models (LLM) and ChatGPT: A medical student perspective. Eur. J. Nucl. Med. Mol. Imaging 50, 2248–2249 (2023).
Zhang, Y., Li, Y., Cui, L., et al. Siren's song in the AI ocean: A survey on hallucination in large language models. arXiv preprint (2023).
Zhao, H., Chen, H., Yang, F., et al. Explainability for large language models: A survey. arXiv preprint (2023).
Chen, H., Gomez, C., Huang, C. M. & Unberath, M. Explainable medical imaging AI needs human-centered design: Guidelines and evidence from a systematic review. Npj Digit. Med. 5, 156 (2022).
Gupta, B., Mufti, T., Sohail, S. S. & Madsen, D. Ø. ChatGPT: A brief narrative review. Cogent Bus. Manag. 10, 2275851 (2023).
Lin, B.Y., Fu, Y., Yang, K., et al. SwiftSage: A generative agent with fast and slow thinking for complex interactive tasks. arXiv preprint (2023).
Kim, G., Baldi, P. & McAleer, S. Language models can solve computer tasks. arXiv preprint (2023).
Wu, Y., Prabhumoye, S., So, Y.M., et al. SPRING: Studying the paper and reasoning to play games. arXiv preprint (2023).
Wang, G., Xie, Y., Jiang, Y., et al. Voyager: An open-ended embodied agent with large language models. arXiv preprint (2023).
Chinese Medical Association Neurosurgery Branch, Chinese Neurosurgical Intensive Care Collaboration Group. Expert Consensus on Early Rehabilitation Management of Severe Craniocerebral Trauma in China (2017). Chin. Med. J. 97, 1615–1623 (2017).
Cheng, K. et al. The potential of GPT-4 as an AI-powered virtual assistant for surgeons specialized in joint arthroplasty. Ann. Biomed. Eng. 51, 1366–1370 (2023).
Zhang, L., Tashiro, S., Mukaino, M. & Yamada, S. Use of artificial intelligence large language models as a clinical tool in rehabilitation medicine: A comparative test case. J. Rehabil. Med. 55, m13373 (2023).
Sacco, S. & Ornello, R. Headache research in 2023: Advancing therapy and technology. Lancet Neurol. 23, 17–19 (2024).
Thirunavukarasu, A. J. et al. Large language models in medicine. Nat. Med. 29, 1930–1940 (2023).
Harris, E. Large language models answer medical questions accurately, but can’t match clinicians’ knowledge. JAMA J. Am. Med. Assoc. 330, 792–794 (2023).
Noy, S. & Zhang, W. Experimental evidence on the productivity effects of generative artificial intelligence. Science 381, 187–192 (2023).
Johri, S., Jeong, J., Tran, B.A., & Schlessinger, DI. Testing the limits of language models: A conversational framework for medical AI assessment. medRxiv (2023).
Shah, N. H., Entwistle, D. & Pfeffer, M. A. Creation and adoption of large language models in medicine. JAMA-J. Am. Med. Assoc. 330, 866–869 (2023).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Rajpurkar, P., Lungren, M. P., Drazen, J. M., Kohane, I. S. & Leong, T. The current and future state of AI interpretation of medical images. N. Engl. J. Med. 388, 1981–1990 (2023).
Strong, E. et al. Chatbot versus medical student performance on free-response clinical reasoning examinations. JAMA Intern. Med. 183, 1028–1030 (2023).
Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).
Yang, G., Ye, Q. & Xia, J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inform. Fus. 77, 29–52 (2022).
Wachter, R. M. & Brynjolfsson, E. Will generative artificial intelligence deliver on its promise in health care?. JAMA-J. Am. Med. Assoc. 331, 65–69 (2024).
Ayers, J. W. et al. Comparing physician and artificial intelligence Chatbot responses to patient questions posted to a public social media forum. JAMA Intern. Med. 183, 589 (2023).
Barlas, T., Altinova, A. E., Akturk, M. & Toruner, F. B. Credibility of ChatGPT in the assessment of obesity in type 2 diabetes according to the guidelines. Int. J. Obes. 48, 271–275 (2024).
Acknowledgements
We extend our heartfelt gratitude to Xu Ruiyu and the neurosurgery team at Ningbo No.2 Hospital and the volunteers who participated in this study, for their valuable contributions. We would also like to express our appreciation to Kaggle (www.kaggle.com) for providing the online platform and their free services.
Funding
The study has been funded by the Project of NINGBO Leading Medical & Health Discipline (2022-S02), 2021 Hwa Mei Medical Education Research Project (2021HMJYZD05), HwaMei Research Foundation of Ningbo No. 2 Hospital (2024HMZD16), 2022 Ningbo Health and Technology Plan Project (2022Y10, 2022Y11), Ningbo Major Science & Technology Project (2022Z126), and HwaMei Research Foundation of Ningbo No. 2 Hospital (2023HMZD03).
Author information
Authors and Affiliations
Contributions
Conceptualization, Z.J.J., L.Z.Z.; methodology, L.Z.Z., Z.W.; software, L.Z.Z.; validation, Z.J.J. and Z.J.F.; formal analysis, L.Z.Z., X.Y.S.; data curation, Z.J.F., L.Z.Z.; writing, original draft preparation, L.Z.Z.; writing, review and editing, L.Z.Z., Z.J.F.; visualization, L.Z.Z., Z.J.J.; supervision, Z.J.J.; project administration, Z.J.J.; funding acquisition, L.Z.Z. All authors have read and agreed to the published version of the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhenzhu, L., Jingfeng, Z., Wei, Z. et al. GPT-agents based on medical guidelines can improve the responsiveness and explainability of outcomes for traumatic brain injury rehabilitation. Sci Rep 14, 7626 (2024). https://doi.org/10.1038/s41598-024-58514-9