Prompt engineering affects large language models' performance in gastrointestinal (GI) oncology.
Prompts with templates and in-context learning enhance large language models' output.
Multi-round interaction helps large language models reach their best performance.
Such performance meets senior GI oncologists' need for effective AI agents.
Yuan, J., Bao, P., Chen, Z., et al. (2023). Advanced prompting as a catalyst: Empowering large language models in the management of gastrointestinal cancers. The Innovation Medicine 1(2): 100019. https://doi.org/10.59717/j.xinn-med.2023.100019
An illustration of how various prompting strategies affect large language models' (LLMs') performance, mediated by a 'storage of knowledge'
Utilizing GPT-4 to Suggest Oncological Treatment Regimens with Distinct Prompting Techniques
Evaluation of prompting template design
A Multi-round Interaction with GPT-4 for Gastric Cancer Treatment Advice