AI Solution for Pharma EHR Challenges

Uploaded by

vishalbackup777

Problem

Design an AI-based solution for the pharmaceutical industry, which faces a
critical challenge in data-driven decision making because EHRs from diverse
sources, ranging from clinical records to genomic information, are
inconsistent with one another. These discrepancies hinder the timely and
accurate analysis of patient outcomes, treatment effectiveness, and disease
progression.

Somil Jain (IIT Roorkee)


somil_j@ar.iitr.ac.in
19110027
7339748149
Problem Stakeholders Solution Pricing Metrics & Pitfalls

Understanding the current process

Need to create a drug → In-depth data required for research → Drug Discovery Phase → Pre Clinical Research → Clinical Trial Phase → Post Trial Data Analysis → FDA Approval for drug → Drug Production starts

60%: EHR data influence formulary decisions for approximately 60% of new drugs.
94%: 94% of hospital CIOs reported EHR interoperability as a major hurdle.
25%: Error rates in EHRs in the USA range from 2.3% to 26.9% across various studies.
20%: Data quality issues could lead to a 20% increase in clinical trial durations.

Pharma companies require a lot of real-world sample data (EHR) for disease biology research, drug discovery and identifying suitable people for clinical trials. However, every institution (AMCs, Insurance claims, Patient Registries, Clinical Trial and Genomic Databases) has a different EHR format, which causes extreme inconsistencies when the data is combined. This affects the analysis and ultimately results in errors (eg. mismatched data fields, duplicate records, date/time errors and blank fields).

Potential Impact of solving it

Enhancing data accuracy could reduce trial times, potentially saving up to $600,000 daily for late-stage trials (Source: PRMA)
Clearer data improves approval rates, with every 1% increase in approval likelihood equating to an estimated $50 million in revenue for a new drug (Source: McKinsey)
Improved data quality could increase operational efficiency by 30%, reducing the time to insight for critical decision-making (Source: Deloitte)
A 10% improvement in data management could save up to $200 million from the R&D budget (Source: PRMA)
Improved data could expedite patient access to new treatments by 3-6 months, benefiting patient health and company revenues (Source: IMS Health).

Reasons for inconsistency in EHR

Heterogeneous Data Sources: Healthcare data comes from various sources, and each source has its own method for recording and utilising information.
Lack of Standardization: There is no universal standard for EHR, which results in different formats and terminologies being used across different platforms.
Manual Data Entry: Human error in manual data entry is responsible for mistakes occurring in the form of typos, omissions, or misinterpretations.
Temporal Discrepancies: Changes in patient conditions over time and delays in updating records can lead to outdated information being present in the EHRs.

Assumptions

Data Availability: EHR data, despite its fragmented nature, is largely obtainable through partnerships or existing public databases & APIs.
Quality of Source Data: The source EHR data, while imperfect, contains sufficient detail to be enhanced through AI-driven curation processes.
Compliance Feasibility: Regulatory frameworks like HIPAA will support the use and interoperability of EHR in drug development, with a focus on data quality and patient privacy using data masking & de-identification.

Pharma Companies

For now, we will only focus on large and medium scale companies because of their larger R&D budgets and their demonstrated readiness to invest in AI-driven solutions.

Scale of company                   Large Tier (≈ 50)              Medium Tier (≈ 200)            Small Tier (≈ 1550)
Avg revenue/year                   $40 Billion/yr                 $10 Billion/yr                 $1 Billion/yr
RND budget                         15% of revenue ≈ $6 Billion    10% of revenue ≈ $1 Billion    5% of revenue ≈ $50 Million
Budget for data & its management   5% of RND ≈ $300 Million       5% of RND ≈ $50 Million        5% of RND ≈ $2.5 Million
Market not penetrated              50%                            40%                            20%
Total Market Size                  50*300*0.5 ≈ $7.5 Billion      200*50*0.4 ≈ $4 Billion        1550*2.5*0.2 ≈ $775 Million

Pharma Company Persona

Global Pharma Inc.
Multinational Pharmaceutical Company
Headquarters: New Jersey, USA
Research Budget: $8 Billion annually
Speciality: Oncology, Cardiovascular, Neurology, Immunology, and Infectious Diseases

Pain Points
Data Fragmentation: Struggles with disparate EHR data formats from various sources, complicating R&D efforts.
Operational Inefficiencies: Data inconsistencies lead to extensive manual data cleaning, increasing operational costs and resource allocation.
Regulatory Hurdles: Encounters difficulties ensuring compliance and hence has to deal with delayed drug approvals due to data quality.

Needs
Standardized Data: An interoperable system for EHR data that ensures data compliance with regulations eg. HIPAA
Advanced Analytics: AI analytics tools for predictive modeling and simulation to identify potential drug

EHR Providers

For now we will only consider AMCs (cost effective & easier data harmonization, because AMCs have multiple branches with the same EHR format), Government Public Data (extensive freely available data) and Clinical Trial Data (optional integration in case of a similar research).

Academic Medical Centers (≈ 120): Extensive data of clinical and research activities. Cost = $500,000/yr
Insurance Companies (≈ 25): Detailed records of healthcare services billed and paid. Cost = $1,000,000/yr
Community Hospitals (≈ 6000): High volume data of diverse & specialized facilities. Cost = $10,000/yr
Clinical Trial Databases (≈ 500*): High-quality structured data from trial research. Cost = $500,000/trial
Genomic Databases (≈ 50): Complex data for disease biology. Cost = $1,000,000/db/yr
Govt. Public Databases (≈ 100): Freely available researched data. Cost = Free/very low

* Note: There are thousands of Clinical Trial Databases in the US, however only a few of them are willing to sell their clinical trial research data.
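The total market size figures for the three pharma company tiers on this slide follow one formula: number of companies × data-management budget per company × share of the market not yet penetrated. The short sketch below re-derives them from the slide's own inputs. The class and field names are illustrative and not part of the original deck. Note that the small-tier product 1550*2.5*0.2 evaluates to $775 million.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One pharma company tier; all money values are in $ millions."""
    name: str
    companies: int            # approximate number of companies in the tier
    avg_revenue_musd: float   # average yearly revenue per company
    rnd_share: float          # R&D budget as a fraction of revenue
    data_share: float         # data-management budget as a fraction of R&D
    unpenetrated: float       # fraction of the market not yet penetrated

    def data_budget_musd(self) -> float:
        # Budget for data and its management, per company.
        return self.avg_revenue_musd * self.rnd_share * self.data_share

    def market_size_musd(self) -> float:
        # companies * per-company data budget * unpenetrated share
        return self.companies * self.data_budget_musd() * self.unpenetrated


# Inputs copied from the slide's table.
TIERS = [
    Tier("Large", 50, 40_000, 0.15, 0.05, 0.5),
    Tier("Medium", 200, 10_000, 0.10, 0.05, 0.4),
    Tier("Small", 1550, 1_000, 0.05, 0.05, 0.2),
]

for tier in TIERS:
    print(f"{tier.name}: ${tier.market_size_musd():,.0f} million")
```

Keeping the inputs in one place makes it easy to test how sensitive the estimate is to the assumed penetration percentages.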
Introducing M-Intel

Get the relevant raw EHRs for research → Harmonize and standardize EHRs → Analyze the data during research

MIntel serves both pharma companies that don't have the raw EHRs to begin their research with and companies that have EHRs from multiple sources but cannot merge the data because it is not harmonized or present in a standard format. In addition, it is equipped with a semantic search model to help researchers identify patterns in the process of developing a new drug.

Providing relevant harmonized data for research

Clients come to us with their query of required EHR → We look into our harmonized data for client needs → Finalize a monetary deal for X entries of data matching requirements

This will majorly target medium-sized pharma companies because they often have limited resources compared to large companies, and outsourcing data acquisition and harmonization through a single service will save the cost and complexity of maintaining a large-scale data harmonization operation.

De-identification: We can use Named Entity Recognition (NER) models, like BERT, to identify and redact personal identifiers from datasets to prevent identification of individuals. eg. the system recognizes the patient's name & DOB and replaces them with anonymous identifiers, such as replacing the name with "Patient 12345" and reducing the DOB to just the year.

Data Cleaning: A learning model like DBSCAN can be trained to identify anomalies or inaccuracies and correct them on the basis of other attributes to improve quality. eg. a weight entry is listed as 6500 kgs; the system recognizes this as an obvious error, corrects it on the basis of age, height etc. and also flags it as a system-generated value.

Data Harmonization and Normalization: Semantic AI can be used to understand the context and meaning behind different data representations and map them to a standardized format. eg. One hospital records blood pressure as "BP" while another records it as "Blood Pressure"; our system would detect this and map both to our decided schema.

Pipeline Creation: Our overall database would have millions of entries, and navigating through them is a time-consuming process. In pipeline creation, a separate schema will be made as per the client's requirement, which saves time as well as resources.

Various app backend and frontend: The backend, with access to the data filtered by the pipeline, will be the server side of the client's app, dealing with data management, role management, permission handling and processing. It will power the user interface and experience components of the app through which users interact.

[Diagram: EHR from AMC 1 ... EHR from AMC n → De-Identification → Data Cleaning; Publicly available data → Data Cleaning; all sources → Data Harmonization → Pipeline Creation for different apps → App 1 ... App "N" Backend → App 1 ... App "N" Frontend]
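The de-identification, harmonization and cleaning stages described on this slide can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions, not the deck's implementation: the records are already structured, so plain field handling stands in for the NER model; a median/MAD outlier rule stands in for DBSCAN; the flagged weight is replaced by the median of the remaining weights rather than inferred from age and height; and the field names, alias table and salt are hypothetical.

```python
import hashlib
import statistics

# Hypothetical alias table mapping source field names to the decided schema.
FIELD_ALIASES = {
    "bp": "blood_pressure",
    "blood pressure": "blood_pressure",
    "weight": "weight_kg",
    "wt": "weight_kg",
}


def deidentify(record: dict, salt: str = "demo-salt") -> dict:
    """Replace the name with a pseudonymous ID and keep only the birth year."""
    out = dict(record)
    name = out.pop("name", None)
    if name is not None:
        digest = hashlib.sha256((salt + name).encode()).hexdigest()
        out["patient_id"] = f"Patient {int(digest[:8], 16) % 100000:05d}"
    dob = out.pop("dob", None)
    if dob is not None:
        out["birth_year"] = int(dob[:4])  # assumes ISO dates (YYYY-MM-DD)
    return out


def harmonize(record: dict) -> dict:
    """Rename known field aliases to the standard schema; keep other fields."""
    return {FIELD_ALIASES.get(k.strip().lower(), k): v for k, v in record.items()}


def flag_outliers(values: list, threshold: float = 3.5) -> list:
    """Flag values whose robust z-score (median/MAD based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [v != med for v in values]
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def clean_weights(records: list) -> list:
    """Replace implausible weights with a typical value and flag the change."""
    weights = [r["weight_kg"] for r in records]
    flags = flag_outliers(weights)
    typical = statistics.median(w for w, bad in zip(weights, flags) if not bad)
    cleaned = []
    for rec, bad in zip(records, flags):
        rec = dict(rec)
        if bad:
            rec["weight_kg"] = typical
            rec["weight_kg_system_generated"] = True
        cleaned.append(rec)
    return cleaned


# Made-up records from two sources with different field names.
raw = [
    {"name": "Jane Roe", "dob": "1984-03-12", "BP": "120/80", "Weight": 70},
    {"name": "John Doe", "dob": "1975-11-02", "Blood Pressure": "135/85", "wt": 82},
    {"name": "Ann Poe", "dob": "1990-06-30", "BP": "118/76", "Weight": 6500},
    {"name": "Raj Rao", "dob": "1968-01-15", "BP": "140/90", "Weight": 65},
    {"name": "Li Wei", "dob": "1959-09-09", "BP": "125/82", "Weight": 90},
]

processed = clean_weights([harmonize(deidentify(r)) for r in raw])
for rec in processed:
    print(rec)
```

Running each stage as a separate function mirrors the diagram: every source passes through de-identification and cleaning before the harmonized records reach any app-specific pipeline.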
Ingesting their own EHR data to be harmonized

It allows pharma companies to upload their own EHR data (encrypted at both ends), with the option of data processing. With minimal human input, it ensures privacy, accuracy, and standardized data ready for analysis, saving time and enhancing research quality. This feature is ideal for large pharma firms that have their own data sources and do not trust other organizations with them.

Login to Admin portal → EHR upload → Data Processing → Decide data access

Login to Admin Portal: A multi-factor authentication portal designed for admins, where they can manage EHRs, Reports and Cohorts and can also access the project's activities, including recent uploads, processing status, and access logs.

EHR Upload: Within the admin portal, there is a dedicated section for EHR data upload. Admins can upload files from their system or can connect with external APIs to fetch hosted data. The portal accepts various file formats and sizes, supporting the diverse types of EHR data that might be collected from different sources.

Data Processing: After the EHR data is uploaded, it enters the AI-driven processing stage. This involves several automated steps including De-Identification, Data Cleaning and Harmonization. Users can track and intervene in the progress of these processes through the admin portal and receive notifications upon completion.

Decide Data Permissions: Once data processing is complete, users can set permissions for who within their organization can access the data on the basis of roles.

[Screenshots: MIntel admin portal, "Account setup for IRN generation". Sidebar: Dashboard, Cohorts, Reports, EHR Management, Notes, Calendar. EHR directory cards such as MGH, Mayo Clinic, UCSF Medical Centre and UCLA (21.5K patients each) and Stanford Medicine (100 GB). Upload progress bar: Upload Data → De-Identification → Data Cleaning → Harmonization, 30%, 1222/100084 files uploaded.]
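The role-based access described under "Decide Data Permissions" can be sketched as a small permission table. The roles and actions below are assumptions made for illustration; the deck does not specify them.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    RESEARCHER = "researcher"
    VIEWER = "viewer"


# Hypothetical permission table: which roles may perform which action.
PERMISSIONS = {
    "upload_ehr": {Role.ADMIN},
    "set_permissions": {Role.ADMIN},
    "run_query": {Role.ADMIN, Role.RESEARCHER},
    "view_reports": {Role.ADMIN, Role.RESEARCHER, Role.VIEWER},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role may perform the action; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())


print(is_allowed(Role.RESEARCHER, "run_query"))   # a researcher may query
print(is_allowed(Role.VIEWER, "upload_ehr"))      # a viewer may not upload
```

Denying unknown actions by default is the safer choice for sensitive EHR data, since a missing entry then never grants access by accident.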
Smart search to convert natural questions to DB queries

Natural Language Input → AI Query Interpretation → Conversion to Database Query → Data Retrieval and Visualization

This feature provides an intuitive way for users to interact with complex databases using simple English queries eg. "Show me the average age of patients with hypertension from the 2020 dataset." Behind the scenes, the search engine uses models like BERT or GPT-3.5 (trained on medical and pharmaceutical text) to understand the context and intent behind the user's natural language and convert it into a SQL query, which in turn processes (categorizes) and fetches the data. The results are presented to the user in an easily digestible format, typically as graphs, charts, or tables eg. a bar graph showing the avg age of hypertension patients.

[Screenshot: MIntel Cohorts page, logged in as John Doe (Admin). Cohort cards dated 5 Nov 2023, each with a patient count and a "View All" link, including Hypertension X Asthma, Hypertension X Age, Cardiovascular Risk, Cancer Survivor, Medication Responders, Hypertension X Covid19 and Hypertension X Geriatric. Attribute filters: Age, Medication, Geographical Region, Haemoglobin, Ethnicity.]

Patient Cohort creation for research

Add more attributes → Cohort Creation → Collaboration and Sharing

It offers a streamlined process for users to define and analyze specific patient
populations within their healthcare data. Users select and add various attributes, such
as age and diagnosis, to set up the criteria for their cohort. The system then
dynamically generates a matching patient segment based on these fine-tuned parameters.

Cohorts can be saved and shared, facilitating collaborative
research while adhering to privacy standards. With AI-driven suggestions and real-time
data processing, this feature simplifies complex data analysis, making it
accessible and efficient for all user levels within a pharmaceutical organization.
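To make the two features on this slide concrete, the sketch below runs the kind of SQL a language model might produce for the sample question, and then builds a cohort query from selected attributes. It assumes a hypothetical `patients` table in an in-memory SQLite database with made-up rows. The model call itself is omitted, so `GENERATED_SQL` is a hand-written stand-in for its output.

```python
import sqlite3

# Hypothetical schema and made-up rows; the deck does not define the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, age INTEGER, diagnosis TEXT, dataset_year INTEGER)"
)
conn.executemany(
    "INSERT INTO patients (age, diagnosis, dataset_year) VALUES (?, ?, ?)",
    [
        (54, "hypertension", 2020),
        (66, "hypertension", 2020),
        (41, "asthma", 2020),
        (70, "hypertension", 2019),
    ],
)

# Stand-in for model output for the question:
# "Show me the average age of patients with hypertension from the 2020 dataset."
GENERATED_SQL = (
    "SELECT AVG(age) FROM patients WHERE diagnosis = ? AND dataset_year = ?"
)
avg_age = conn.execute(GENERATED_SQL, ("hypertension", 2020)).fetchone()[0]
print("Average age:", avg_age)


def build_cohort_query(min_age=None, max_age=None, diagnosis=None):
    """Turn the selected cohort attributes into a parameterized SQL query."""
    clauses, params = [], []
    if min_age is not None:
        clauses.append("age >= ?")
        params.append(min_age)
    if max_age is not None:
        clauses.append("age <= ?")
        params.append(max_age)
    if diagnosis is not None:
        clauses.append("diagnosis = ?")
        params.append(diagnosis)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT id FROM patients WHERE {where} ORDER BY id", params


sql, params = build_cohort_query(min_age=60, diagnosis="hypertension")
cohort_ids = [row[0] for row in conn.execute(sql, params)]
print("Cohort patient ids:", cohort_ids)
```

Passing values as query parameters rather than pasting them into the SQL string is a deliberate choice here: generated or user-supplied values then cannot alter the query structure.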

As part of the pricing GTM, our primary focus is to find a sweet spot between recovering our initial costs and staying affordable, since we would be entering a
new market. Under this strategy, we need to focus on gaining market share and increasing our revenue through sales.
Pricing Strategy

Total costs (Fixed Costs and Variable Costs): Software Licenses, AI model development cost, Employee cost, Infrastructure (scalability), EHR Outsourcing, Server Cost, Monthly variable cost, Marketing & Sales

Exploring

1. Price Skimming: The price is set for high-paying customers, and then lowered over a period of time. This gives a higher return on investment.
2. Penetration pricing: It is just the opposite of price skimming: the price is initially set lower and then increased gradually over time.
3. Value based pricing: This pricing is based on the customer's perceived value of the product rather than the actual cost of the product.
4. Competitive pricing: When there are a lot of competitors, it is better to set prices on the lower side to prevent customers from going to our competitors.
5. Cost plus pricing: In this type of pricing, the product is priced on the basis of its cost. Simply, some percentage is added over the cost.

Value Based Pricing

Subscription-Based Model (Client using our generic EHR database)
Tiered Access: Different subscription tiers offering varying levels of access and features, such as the number of user accounts, volume of data storage, and processing power for data analysis.
Flexible Plans: Monthly and annual subscription options to provide flexibility and encourage long-term commitment.

Enterprise Custom Contracts (Client with specific EHR requirements)
Bespoke Solutions: Custom pricing for large pharmaceutical companies requiring extensive use of the platform, including integration with existing systems, custom features, or dedicated support.
Volume Discounts: Reduced rates for high-volume data purchase or for companies that commit to a certain level of usage.

Freemium Model (Requirement of only EHR Harmonization)
Basic Features for Free: Limited access to the platform for free, allowing users to upload data and run a certain number of queries or create a limited number of cohorts.
Premium Features: Advanced features, such as more complex queries, larger data uploads, or additional cohort analyses, available for a fee.

Metrics to look

Acquisition: Number of customers acquired; Cost Per Acquisition
Activation: User Onboarding Completion Rate; Successful First Query; Successful First Cohort Creation
Retention: User Onboarding Completion Rate; Monthly Active users; Weekly Active users; Churn Rate
Engagement: Avg Session Duration; Number of Queries per Session; Number of Cohorts Created per customer
Monetization: Average Revenue per User; Conversion (Free trial to paid customers)
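Several of these metrics are simple ratios. The sketch below uses the common definitions, which are an assumption here because the deck does not define them, with made-up numbers for illustration only.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of customers at the start of a period who left during it."""
    return ratio(customers_lost, customers_at_start)


def trial_conversion_rate(trial_users: int, converted_to_paid: int) -> float:
    """Share of free-trial users who became paying customers."""
    return ratio(converted_to_paid, trial_users)


def average_revenue_per_user(total_revenue: float, active_users: int) -> float:
    return ratio(total_revenue, active_users)


def cost_per_acquisition(acquisition_spend: float, new_customers: int) -> float:
    return ratio(acquisition_spend, new_customers)


# Made-up example values.
print(f"Churn rate: {churn_rate(200, 10):.1%}")
print(f"Trial conversion: {trial_conversion_rate(50, 10):.1%}")
print(f"ARPU: ${average_revenue_per_user(120_000, 40):,.0f}")
print(f"CPA: ${cost_per_acquisition(90_000, 15):,.0f}")
```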

Pitfalls and Mitigations

Potential Pitfall: Mishandling sensitive data can lead to breaches of regulations like HIPAA.
How to reduce risk: Implement robust encryption, access controls, and regular audits of de-identification techniques.

Potential Pitfall: AI algorithms might lead to inaccurate analysis.
How to reduce risk: Establish rigorous data cleaning and validation processes. Continuously monitor data quality and retrain AI

Potential Pitfall: Over-reliance on AI could lead to potentially overlooking errors.
How to reduce risk: Educate users on the strengths and limitations of AI. Implement checks and balances for human verification.

Potential Pitfall: Regulations governing EHR data are subject to change, which could leave the platform non-compliant.
How to reduce risk: Engage with legal experts to anticipate and respond to regulatory shifts proactively.

What more can be done?

Clinical Trial Matching Module: We can develop an advanced system that uses AI to match patients with ongoing clinical trials based on their health data, potentially accelerating recruitment and improving trial diversity.

Decentralized Data Ecosystems: Blockchain can be utilized to create a decentralized data exchange where researchers can share data securely and patients can control who accesses their information.

Global Regulatory Compliance: Currently I have considered only the US market; however, we can extend the product so that it automatically adapts data handling and privacy measures to comply with global regulations, not just HIPAA.

Patient Engagement Tools: These modules will allow for direct patient engagement and data collection, such as digital consent forms, patient-reported outcomes and data collection capabilities to include data from wearable devices and Internet of Things (IoT) healthcare applications.
