Retrospective patient chart reviews of electronic medical records are a crucial process for quality improvement efforts that impact clinical care, medical billing (cost-effective care), and internal compliance standards. However, the traditional process of using a team of physicians is time-consuming and labor-intensive, and it requires training on shared criteria to avoid inconsistencies. As a Generative AI fellow at X = Primary Care (XPC) in Spring 2025, I developed QuICR for my self-directed project: an end-to-end Python app that automates chart review for the primary care clinical setting using natural language processing techniques.
- 1-2 hours per chart review reduced to < 1 minute.
- Cost-effective: < $0.01 per chart review.
- Just hit "Enter" and the app does the rest. No repetitive copy-pasting.
- Supplies suggestions to improve the documentation of the chart for each documentation issue identified.
- Provides a bird's-eye view of all patient chart reviews to identify trends in a user's documentation practices, and ranks the charts by the severity of the documentation issues.
- Reduces variability in chart reviews by providing structured, standardized criteria to identify documentation issues.
- Finds the generic names of medications and retrieves the prices of the medications from Walmart's generic drug list and CostPlusDrugs to aid cost-effective care.
- The insights are presented in an organized, easy-to-read format (inspired by Swiss design principles), available in a web browser (HTML) and as a PDF for easy sharing.
Chart Review on Synthetic Patient, Bill Moore (Click image to enlarge or view corresponding PDF.)
Aggregate Documentation Report (Click to enlarge image or view corresponding PDF.)
- Prompt engineering.
- System prompt path:
prompt/system/system_prompt_chart_review_2.txt
- Structured Outputs and JSON Schemas.
- OpenAI's Structured Outputs feature, used via the API, constrains model responses to a supplied JSON schema, which makes complex, multi-step NLP tasks more reliable.
- My custom JSON schemas enforce strict adherence to the defined LLM response format.
- JSON schema path:
prompt/json_schema/
- Named-Entity Recognition (NER).
- Standardization of medications to generic names using Unified Medical Language System (UMLS) linker via SciSpaCy.
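To illustrate the Structured Outputs item above, here is a minimal sketch of how a strict JSON schema might be defined and how a reply could be checked against it. The schema fields (`description`, `severity`, `suggestion`) and the helper names are hypothetical and are not taken from the files under prompt/json_schema/; no API call is made here.

```python
import json

# Hypothetical schema for documentation issues. The field names are
# illustrative assumptions, not the project's actual schemas.
ISSUE_SCHEMA = {
    "type": "object",
    "properties": {
        "issues": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "severity": {"type": "string", "enum": ["low", "moderate", "high"]},
                    "suggestion": {"type": "string"},
                },
                "required": ["description", "severity", "suggestion"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["issues"],
    "additionalProperties": False,
}


def build_response_format(name, schema):
    """Wrap a JSON schema in the response_format shape used for strict Structured Outputs."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def parse_issues(raw_json):
    """Parse a model reply and re-check each issue's keys and severity value."""
    item_schema = ISSUE_SCHEMA["properties"]["issues"]["items"]
    required = set(item_schema["required"])
    allowed = set(item_schema["properties"]["severity"]["enum"])
    issues = json.loads(raw_json)["issues"]
    for issue in issues:
        if set(issue) != required:
            raise ValueError(f"unexpected keys: {sorted(issue)}")
        if issue["severity"] not in allowed:
            raise ValueError(f"unexpected severity: {issue['severity']}")
    return issues
```

In an app like this one, the dictionary returned by `build_response_format` would typically be passed as the `response_format` argument of a chat completion request; the local re-check in `parse_issues` is an optional extra safeguard, not something the source describes.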
- The Jinja2 templates I created define the visual organization (i.e., HTML structure) for the key highlights, problem plans, anticipatory health maintenance, and follow-up plan.
- WeasyPrint renders the HTML to PDF with the custom CSS styles.
- Together, they create a professional report that is easy to read and share with others.
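The HTML-then-PDF flow described above can be sketched roughly as follows. This is an assumption-laden illustration: the inline template, the function names, and the `css_path` parameter are hypothetical, and the project's actual Jinja2 templates and CSS live in its own files. It requires the `jinja2` package, and `weasyprint` only when a PDF is written.

```python
from jinja2 import Environment

# Hypothetical inline template; the real templates cover key highlights,
# problem plans, anticipatory health maintenance, and the follow-up plan.
REPORT_TEMPLATE = """<html><body>
<h1>Chart Review: {{ patient_name }}</h1>
<ul>
{% for item in highlights %}<li>{{ item }}</li>
{% endfor %}</ul>
</body></html>"""


def render_report_html(patient_name, highlights):
    """Render the report HTML; autoescape guards against stray markup in chart text."""
    env = Environment(autoescape=True)
    template = env.from_string(REPORT_TEMPLATE)
    return template.render(patient_name=patient_name, highlights=highlights)


def write_report_pdf(html, pdf_path, css_path=None):
    """Convert rendered HTML to a PDF file, optionally applying a custom stylesheet."""
    # Imported lazily so HTML rendering still works without WeasyPrint installed.
    from weasyprint import CSS, HTML

    stylesheets = [CSS(filename=css_path)] if css_path else None
    HTML(string=html).write_pdf(pdf_path, stylesheets=stylesheets)
```

Keeping rendering and PDF conversion in separate functions means the same HTML string can serve both the browser view and the shareable PDF.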
- The inference code captures token usage metrics to aid in monitoring cost and processing speed over time. For example:
generated_output/o4-mini-2025-04-16/usage
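As a rough illustration of how captured token counts can be turned into cost and speed figures, consider the sketch below. The `Usage` record, the function names, and the per-million-token prices passed in are all assumptions for demonstration; they do not reflect the format of the project's usage files or any actual model pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-chart usage record."""
    prompt_tokens: int
    completion_tokens: int
    seconds: float


def estimate_cost_usd(usage, input_price_per_million, output_price_per_million):
    """Estimate the cost of one chart review from its token counts."""
    total = (
        usage.prompt_tokens * input_price_per_million
        + usage.completion_tokens * output_price_per_million
    )
    return total / 1_000_000


def summarize(usages, input_price_per_million, output_price_per_million):
    """Aggregate cost and latency across a non-empty list of chart reviews."""
    n = len(usages)
    total_cost = sum(
        estimate_cost_usd(u, input_price_per_million, output_price_per_million)
        for u in usages
    )
    return {
        "charts": n,
        "total_cost_usd": total_cost,
        "mean_cost_usd": total_cost / n,
        "mean_seconds": sum(u.seconds for u in usages) / n,
    }
```

Passing the prices as arguments rather than hard-coding them keeps the summary valid when the model or its pricing changes.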
- The environment is fully reproducible via the provided `myenv.yml` Conda specification, ensuring that all dependencies (Python, SciSpaCy, WeasyPrint, etc.) can be installed consistently across Linux, macOS, and Windows.
- After cloning the repository, create a conda environment using the `myenv.yml` file. Activate the environment using the command `conda activate quicr`.
- Assign your OpenAI API key to the `OPENAI_API_KEY` environment variable in your `.env` file. Note: this is separate from having a ChatGPT Plus account, and you need to add funds to your OpenAI Platform account to use the API.
- While in the root of the project directory, run the app using the command `python app.py` in your terminal.
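Before running the app, it can help to confirm that the API key from the setup steps is actually present. The following is a minimal, standard-library sketch of such a pre-flight check; the function names are hypothetical, the parser handles only simple `KEY=VALUE` lines, and the app itself may load the `.env` file differently (for example with a dedicated library).

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(env_text):
    """Return True if the .env content defines a non-empty OPENAI_API_KEY."""
    return bool(parse_env(env_text).get("OPENAI_API_KEY"))
```

For example, `has_api_key(open(".env").read())` could be called from the project root before starting the app.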
Synthetic data that closely resembles primary care patient chart data (e.g., demographics, medications, lab results, and clinical notes) was used to test the feasibility of the app. This data was created by expert primary care physicians to represent the patient data they have encountered in their practice.
This is a functional application to demonstrate the value of the technologies and techniques used in my project. While the outputs are impressive, testing on numerous cases is required. In its current form, the app (and its components) is not intended to be used for clinical decision making without expert physician supervision or to replace human judgement in patient care. Walmart and CostPlusDrugs are not affiliated with this project, and their mention in this project is not an endorsement; the prices of medications may vary by location and are subject to change.
Evaluation studies involving thousands of patient charts are needed to fully assess how well the app captures the nuances of patient clinical presentations and the complexity of clinical decision making. Hybrid frameworks such as EvalGen, which evaluate GenAI outputs using human input together with an LLM evaluator and a set of criteria, could be used to assess the performance of the app. This approach is suitable because it allows human judgement to be incorporated for nuanced cases and can help align the LLM evaluator with human appraisal, while the evaluator processes a large number of charts faster than humans can.
I would like to thank Paulius Mui, M.D. (founder of X = Primary Care) for his mentorship and support throughout the fellowship.
Morris A. Aguilar, Ph.D.
XPC Generative AI Fellow, Spring 2025.
morrisglr@proton.me
@morrisglr.bsky.social