
Large language models empowering pharmacoepidemiology research

Published on Aug. 13, 2025

Authors: SI Shucheng 1, 2#; WU Liuliu 1, 2#; WANG Conghui 3; YANG Ziming 4; DU Jian 5; WANG Shengfeng 2, 6; ZHAN Siyan 1, 2, 6, 7

Affiliations:
1. Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing 100191, China
2. Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China
3. Drug Monitoring and Evaluation Division, Inner Mongolia Pharmacovigilance Center, Hohhot 010010, China
4. Department of Human Resources, Peking University First Hospital, Beijing 100034, China
5. National Institute of Health Data Science, Beijing 100191, China
6. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China
7. Center for Intelligent Public Health, Institute for Artificial Intelligence, Peking University, Beijing 100871, China

Keywords: Artificial intelligence; Large language models; Pharmacoepidemiology; Drug discovery; Drug repurposing; Pharmacovigilance

Abstract

The emergence of artificial intelligence (AI) has significantly influenced medical research and practice, in both the number of studies and the research paradigms used, and AI has become an important tool for the development of pharmacoepidemiology. However, while traditional AI has facilitated pharmacoepidemiology research, it has also faced many challenges, such as complex data processing, difficulty in identifying drug exposures and potential outcomes, and time-consuming, laborious study design and implementation. The rapid development of generative AI, represented by large language models (LLMs), has demonstrated unique potential to enhance research efficiency, shift research paradigms, and facilitate knowledge discovery. LLMs are equipped with natural language understanding and generation capabilities. By mining multi-dimensional data resources in depth, LLMs can quickly and accurately extract, analyze, summarize, and present the required information. This can support pharmacoepidemiological tasks such as drug discovery, drug repurposing, and pharmacovigilance, and can also assist the whole research process, from protocol design and data analysis to result interpretation and paper publication. Driven by LLMs, pharmacoepidemiology research is gradually moving into a new stage based on big data and automated analysis. LLMs nevertheless raise problems of data bias, "hallucination" of results, and ethical and legal regulation. By strengthening interdisciplinary cooperation, establishing a standardized evaluation system, improving ethical and regulatory guidance, enhancing data quality, strengthening practitioner training and capacity building, and promoting human-machine collaborative research modes, the potential of LLMs in pharmacoepidemiology is expected to be fully realized, providing more scientific, rapid, and efficient technological support for drug regulation and public health decision-making.
