Application and frontier exploration of retrieval-augmented generation technology in medical artificial intelligence

Published on Aug. 29, 2025

Author: JIN Zhe 1, ZOU Jian 1, LI Xiao 1, Lyu Jiaxin 1, HU Zhongxu 2, FENG Da 1

Affiliation: 1. School of Pharmacy, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China; 2. School of Mechanical Science & Engineering, Huazhong University of Science and Technology, Wuhan 430074, China

Keywords: Large language model; Retrieval-augmented generation; Hallucination problem; Medical information retrieval

DOI: 10.12173/j.issn.1005-0698.202503219

Reference: JIN Zhe, ZOU Jian, LI Xiao, Lyu Jiaxin, HU Zhongxu, FENG Da. Application and frontier exploration of retrieval-augmented generation technology in medical artificial intelligence[J]. Yaowu Liuxingbingxue Zazhi, 2025, 34(8): 961-970. DOI: 10.12173/j.issn.1005-0698.202503219.[Article in Chinese]

Abstract

With the rapid rise of large language models (LLMs), the natural language generation capabilities of deep learning have demonstrated significant value in the medical field. However, the "closed nature" of model parameters makes LLMs prone to "hallucinations" and unable to answer questions about the latest knowledge accurately, and their reasoning process lacks transparency and traceability. Retrieval-augmented generation (RAG) addresses these issues by actively connecting to external information sources, such as document databases and knowledge graphs, during generation. This significantly reduces the dependence of LLMs on outdated training data and introduces verifiable evidence and real-time knowledge updates into their responses. In the medical field, RAG meets the high accuracy and traceability requirements of literature retrieval and clinical decision support, and it is widely applied in areas such as drug discovery, pharmacovigilance, and the diagnosis and treatment of rare diseases. By integrating emerging techniques such as reinforcement learning, multimodal processing, and compliant privacy protection, RAG is evolving in a more open and highly customizable direction, providing innovative intelligent solutions for medical information retrieval and decision-making support.
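
The retrieve-then-generate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-document corpus, the bag-of-words cosine scoring, and the names `DOCUMENTS`, `retrieve`, and `build_prompt` are all assumptions made for illustration, and the step that would send the prompt to an LLM is left out. The sketch only shows how retrieved passages, tagged with source ids, can be placed in front of a question so that an answer stays traceable to its evidence.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; the texts are placeholders, not clinical guidance.
DOCUMENTS = {
    "doc-1": "Pharmacovigilance monitors adverse drug reactions after approval.",
    "doc-2": "Knowledge graphs link drugs, targets, and diseases as typed relations.",
    "doc-3": "Rare disease diagnosis often relies on case reports and registries.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return ids of up to k documents with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    # Highest score first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that places retrieved, id-tagged evidence before the question."""
    ids = retrieve(query, documents, k)
    evidence = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the evidence below and cite the source ids.\n"
        f"{evidence}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How are adverse drug reactions monitored?", DOCUMENTS))
```

In a real system the toy scorer would typically be replaced by a dense or hybrid retriever over a document database or knowledge graph, and the assembled prompt would be passed to the LLM. Updating the corpus changes the evidence available at answer time without retraining the model.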
