MedPrompt: Care Coordination and Pathways for Cancer
Patients using Digital Information
Nikitha Pagadala (22WH1A0572)
Bhavika Mandadi (22WH1A0595)
Manvitha Chowtapalli (22WH1A0597)
Under the Guidance of
Dr. Akansha Tyagi
Professor
Dept of C.S.E,
BVRIT Hyderabad College of Engineering for Women.
June 3, 2025
Table of Contents
1 Problem Statement
2 Abstract
3 Literature Survey
4 Research Gaps
5 Methodology
6 Technology stack
7 Results
8 References
2022-2026 Department of CSE June 3, 2025 2 / 13
Problem Statement
Unstructured clinical data and complex medical jargon make it difficult for patients
and healthcare professionals to extract meaningful insights.
Existing AI and LLM solutions lack structured, interpretable formats suitable for
integration into clinical workflows.
There is a gap in combining Named Entity Recognition (NER), entity filtering, and
Large Language Models (LLMs) for personalized, privacy-preserving, and cost-efficient
cancer care.
The challenge lies in automating clinical documentation, patient-specific query
response, and maintaining information privacy across digital healthcare platforms.
Abstract
Good health is a fundamental human need.
Recent research shows the rapid emergence of AI in the healthcare domain.
Healthcare problems can be addressed through the following 9 concepts:
1 Personalized
2 Predictive
3 Participatory
4 Precise
5 Preventive
6 Pervasive
7 Privacy-preserving
8 Protective
9 Price-reasonable
Clinical data is a key input for medical and health research.
Abstract (contd.)
According to Microsoft Bing and Google search data, 12,000 people worldwide search
for a health-related query every 10 minutes.
AI agents such as ChatGPT, Gemini, and Claude have helped make the answers to these
health-related queries more understandable for users.
The main question remains how to make health solutions:
Predictive
Precise
Privacy-preserving
Price-reasonable
The proposed approach aims to solve these four problems.
Literature Survey (1/2)
Literature Survey (2/2)
Research Gaps
Most systems (e.g., CLEAR) stop at ranking and entity retrieval; they do not
produce structured EHR-style summaries with LLMs.
RL algorithms require improvements in sample efficiency and scalability to make them
more practical for real-world applications.
Systems such as SciSpacy and MetaMap struggle with ambiguous abbreviations (e.g., "RA"
= Right Atrium or Rheumatoid Arthritis) and with accurate linking to ontologies such as
UMLS.
Conventional IR techniques such as BM25 rely on lexical matching and commonly neglect
patient-specific information and temporal history.
Models such as BioBERT and ClinicalBERT are limited to domain-specific datasets
(e.g., MIMIC-III), which exclude rare diseases and more general clinical scenarios.
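The lexical-matching limitation of BM25 noted above can be illustrated with a small sketch. This is a plain-Python version of the standard Okapi BM25 formula, not code from MedPrompt; the three sample notes and the parameter defaults (k1 = 1.5, b = 0.75) are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # purely lexical: unseen terms contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Illustrative notes (invented for this sketch, not real patient data).
notes = [
    "patient reports a heart attack in 2019",
    "history of myocardial infarction treated with stent",
    "no cardiac complaints at this visit",
]
scores = bm25_scores("myocardial infarction", notes)
```

Only the second note shares words with the query, so the first note scores zero even though "heart attack" describes the same event. Nothing in the score reflects which patient a note belongs to or when it was written.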
Methodology
1 Data Collection: Curated datasets from open-access medical repositories
(e.g., MIMIC-III, PubMed, Medline).
2 Preprocessing: Tokenization, sentence splitting, removal of PHI (Protected Health
Information).
3 NER and Entity Linking: Using tools like SciSpacy for medical term identification
and UMLS linking.
4 Entity Filtering: Filtering based on relevance to cancer pathways and treatment
protocols.
5 LLM-Based Summarization: Employing models such as ChatGPT or BioGPT for
generating EHR-style outputs.
6 Validation: Outputs compared against expert-annotated samples for accuracy and
interpretability.
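Steps 2 to 5 above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the MedPrompt implementation: a tiny hand-written lexicon stands in for SciSpacy and UMLS linking, the concept identifiers are placeholders rather than real UMLS CUIs, the two regular expressions cover only a date format and a record number (real PHI removal needs far more), and the final step only builds the prompt text that would be sent to an LLM.

```python
import re

# Hypothetical mini-lexicon standing in for SciSpacy + UMLS linking.
# The concept IDs are placeholders, not real UMLS identifiers.
LEXICON = {
    "breast cancer": ("DISEASE", "ID-0001"),
    "tamoxifen": ("DRUG", "ID-0002"),
    "chemotherapy": ("TREATMENT", "ID-0003"),
    "headache": ("SYMPTOM", "ID-0004"),
}
# Assumed set of labels considered relevant to cancer pathways.
CANCER_RELEVANT = {"DISEASE", "DRUG", "TREATMENT"}

# Illustrative PHI patterns only: dd/mm/yyyy dates and "MRN" record numbers.
PHI_PATTERNS = [
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[DATE]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[MRN]"),
]

def remove_phi(text):
    """Step 2: mask the PHI patterns listed above."""
    for pattern, tag in PHI_PATTERNS:
        text = pattern.sub(tag, text)
    return text

def split_sentences(text):
    """Step 2: naive sentence splitting on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_entities(sentence):
    """Step 3: dictionary lookup as a stand-in for NER and entity linking."""
    lowered = sentence.lower()
    return [
        {"text": term, "label": label, "concept_id": cid}
        for term, (label, cid) in LEXICON.items()
        if term in lowered
    ]

def filter_entities(entities):
    """Step 4: keep only entities with cancer-relevant labels."""
    return [e for e in entities if e["label"] in CANCER_RELEVANT]

def build_prompt(entities):
    """Step 5: assemble the text that would be passed to the LLM."""
    lines = [f"- {e['label']}: {e['text']} ({e['concept_id']})" for e in entities]
    return ("Summarize these findings as a structured EHR-style note:\n"
            + "\n".join(lines))

# Invented example note, not real patient data.
note = ("Patient MRN: 48213 seen on 03/06/2025. Diagnosed with breast cancer. "
        "Started tamoxifen; reports mild headache.")
clean = remove_phi(note)
entities = [e for s in split_sentences(clean) for e in extract_entities(s)]
kept = filter_entities(entities)
prompt = build_prompt(kept)
```

Filtering before prompting keeps the LLM input short and free of the masked identifiers, which is one way the pipeline could support the privacy-preserving and price-reasonable goals stated earlier.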
Technology
Libraries
Transformers
Pandas
NumPy
Tools
1. Ollama LLM
2. Streamlit
3. Python subprocess module
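One way the Ollama and subprocess entries above could fit together is sketched below. It assumes the Ollama CLI is installed and that a model has already been pulled; the model name "llama3", the function names, and the timeout are illustrative assumptions, since the slides do not say which model or call pattern the project uses.

```python
import subprocess

def build_ollama_command(model, prompt):
    """Build the argument list for a one-shot `ollama run` call."""
    return ["ollama", "run", model, prompt]

def query_local_llm(prompt, model="llama3", runner=subprocess.run, timeout=120):
    """Send a prompt to a locally hosted model and return its text reply.

    `runner` defaults to subprocess.run and is a parameter only so the
    function can be exercised without Ollama installed.
    """
    result = runner(
        build_ollama_command(model, prompt),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # avoid hanging on a stalled model
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout.strip()

# Example call (requires a local Ollama installation):
# summary = query_local_llm("Summarize: patient started tamoxifen.")
```

Running the model locally means clinical text does not leave the machine, and passing the command as a list avoids shell interpretation of the prompt. In a Streamlit app, the returned string could be displayed with `st.write`.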
Results (1/2)
Figure 1: MedPrompt User Interface
Figure 2: Raw Clinical free text as input
Figure 3: Output of Clinical NER
Results (2/2)
Figure 4: Definitions and contextual information
Figure 5: Concise structured output
Figure 6: Summary and visualization module
References
Yuan, M., Bao, P., Yuan, J., Shen, Y., Chen, Z., Xie, Y., Zhao, J., Li, Q., Chen, Y., Zhang, L. and Shen,
L., 2024. Large language models illuminate a progressive pathway to artificial intelligent healthcare assistant.
Medicine Plus, p.100030.
Bose, P., Srinivasan, S., Sleeman IV, W.C., Palta, J., Kapoor, R. and Ghosh, P., 2021. A survey on recent named
entity recognition and relationship extraction techniques on clinical texts. Applied Sciences, 11(18), p.8319.
Kothari, A.N., 2023. ChatGPT, large language models, and generative AI as future augments of surgical cancer
care. Annals of Surgical Oncology, 30(6), pp.3174-3176.
Khan, M.A., Ayub, U., Naqvi, S.A.A., Khakwani, K.Z.R., Sipra, Z.B.R., Raina, A., Zhou, S., He, H., Saeidi, A.,
Hasan, B. and Rumble, R.B., 2025. Collaborative large language models for automated data extraction in living
systematic reviews. Journal of the American Medical Informatics Association, p.ocae325.
Deroy, A. and Maity, S., 2024. Cancer-Answer: Empowering Cancer Care with Advanced Large Language Models.
arXiv preprint arXiv:2411.06946.