Real World Data (RWD) is data collected outside of clinical trial study, and Real-World Evidence (RWE) could be achieved through the insight from RWD. RWD sources come from EMR, health insurance claims, genomic data, and IoT from apps and wearables. RWD anonymized patient data has revolutionized how companies view patient data since it captures longitudinal pharmacy prescription, medical claims, and diagnosis.
The paper is written for those who want to understand how RWD patient data are collected and how they could be analyzed to support pharmaceutical companies. Mainly, RWD patient data could support patient analytics, commercial analytics, and payer analytics such as source of business, switch of prescription, payment method, market analysis, promotional activities, drug launch and forecasting. The paper also discusses the technology that data scientists use for RWD such as Data Warehouse, Data Visualization, Opensource Programming, Cloud Computing, GitHub, and Machine Learning.
2. Disclaimer
The views and opinions presented here represent those of the
speaker and should not be considered to represent any
companies or organizations.
3. • What is RWD?
• Why is it important?
• Patient’s Journey using RWD
• Examples of Insights from RWD
• Challenges in RWD
• Technology requirements of
RWD
• Conclusion
4. 4
What is Real Word Data (RWD)?
Data collected outside of Clinical Trial Study
Real World Data
Clinical Trial
Data
6. 6
Real World Data vs Clinical Trial Data
Real World Data Clinical Trial Data
Population All the population Population of Eligible Criteria
Patients Number ~100 million 300 to 3,000 volunteers in Phase 3
Bias Various bias Minimized bias
Environment Real World Controlled
Visit Any Visit Scheduled Visit
Purpose Effectiveness Efficacy
Treatment Schedule As needed Fixed
Comparison All the drugs Placebo / Control
Patient Monitor As needed Continuous during Clinical Trial
Data Quality As they come Cleaned
Data Collection Various sources (e.g., EHR, Claims) EDC per protocol
Data Format Any format SAS dataset format
Data Structure Any data structure Tabular data structure
7. 7
Why is RWD important? • More comprehensive, holistic
patient journey history
• More Personalized medicine
• Safety and Efficacy findings in
bigger population
• More complete pictures on how
medicines are used in the real world
rather than controlled environment
in Clinical Trial
• Leading to better clinical trial design
(e.g., Patient recruitment, other
indication )
• Helping drug companies to manage
product life cycle (pre-launch, post-
launch, during patent, after patent)
• Helping drug companies for
marketing and sales strategies
8. 8
Patient 1001
got AC1 LAB
test on 2020-
01-15.
Patient
1001 started
“Metformin
” on 2020-
01-25.
Patient 1001
was diagnosed
with
‘prediabetes’
on 2020-01-22.
Patient 1001
got AC1 LAB
test on 2021-
01-10.
Patient 1001
was diagnosed
with ‘Diabetes,
Type 2’ on
2021-01-20.
Patient 1001
switched to
“Jardiance 10 mg”
on 2021-01-23.
Patient 1001
got AC1 LAB
test on 2021-
06-14.
Patient 1001
switched to
“Jardiance 25 mg”
on 2021-06-20.
Patient 1001 visited his family doctor.
Patient 1001 used “Aetna” insurance
Patient 1001
complained
about
“Dizziness”.
Patient 1001
got AC1 lab
result at 6.2%
on 2020-01-21
Patient 1001
got AC1 lab
result at 7.1%
on 2021-01-18
Patient 1001
got AC1 lab
result at 8.0%
on 2021-06-18
Patient 1001
got AC1 LAB
test on 2022-
01-14.
Patient 1001
got AC1 lab
result at 7.0%
on 2022-01-18
Patient’s Journey in RWD
9. 9
Data in RWD Patient’s Journey
Real World Data
(RWD)
Patient
Demograp
hics
Diagnosis
Procedures
Claims
Healthcare
Provider
Insurance
Payers
AE
LAB
Vital
• EHR
• Patient’s Medical
Activity based
• Demo, Vitals, Lab,
AE, provider’s notes
• Claims
• Patient’s Claim
Activity based
• Prescriptions,
Diagnosis,
Procedures
• Source
• Providers
• Payers
10. 10
Examples of insights from RWD (1)
Clinical Trial Optimization
• Originally, Jardiance is approved for Type 2 Diabetes treatment, but in RWD, it
has been prescribed for pre-diabetes patients.
• Using RWD, Statisticians / programmers has found Jardiance show positive
effectiveness with low safety issues.
• Using RWD, Statisticians / programmers could compare effectiveness
compared to competitors.
• Using RWD, Statisticians / programmers could find safety information of
medicines.
• Using RWD, Patients recruitments are more effective.
11. 11
Sales Performance Metrics
• Geographic Sales
Performance
• Sales Performance vs
Forecasts
• Sales Comparison with
Competitors ( e.g., 50%
market share for
Jardiance)
• Sales Performance by
quarters or month
• Source of Business
• New
• Switch
• Continued
Examples of insights from RWD (2)
12. 12
Sales Performance Metrics
GeographicSales Performance
Sales Performance vs Forecasts
Sales Comparison with Competitors
Sales Performance by quarters
Source of Business
New
Switch
Continued
Examples of insights from RWD (3)
Treatment Effectiveness /
Personalized Medicine
• 1st Line vs 2nd Line
• Combo and Mono
Therapy
• Patient Profile – Age,
Gender, Races,
Diagnosis
13. 13
Challenge in RWD
• Quality of data
• As they come, not cleaned /
validated
• High possibility of Bias
• Various data sources (e.g., Veeva
CRM, IQVIA Claim, Hospital Data)
• Unstructured / Unstandardized Data
• Big Data
• Data integration / ETL
• System Infrastructure for RWD –
Data Storage, Data Processing,
Analysis
• Return of Investment
• Usage in Clinical Trials
• Regulatory restriction
15. RWD is Big
Data
• It contains up to 5 years of ~100 million patient’s records per TA.
• ~1 trillion claims for Diagnosis, Procedure & Prescription.
• Typical Analysis for RWD contain up to ~1 trillion records.
• Tools : Cloud Computing
16. RWD in Cloud
Computing
• Big Data in RWD requires much bigger computing power, which fits
well with Cloud Computing.
• Flexible and scalable Cloud Computing environment will serve RWD
Advanced Analytics (e.g. AI, prediction, Data Visualization) well.
• An easy access for the reports and analysis thru internet connection.
• A great collaboration environment / platform for research and analysis.
• Tools : AWS, Azure, Databricks
17. RWD in Central Database /
Data Warehouse
17
• Tool to store, manage and govern RWD Data
• Security and Data Quality for Sensitive health care data
• Data Share and Access to different stakeholders
• Centralized data storage
• Tool : AWS Redshift, Snowflake, Oracle, Hive
18. RWD in Open-Source
Programming
• RWD is not a regulatory required data.
• Open-source programming for advanced analytics (e.g.
Machine Learning, Data Visualization)
• More and more organizations are expanding to open-
source programming for RWD analysis for operating cost
and advanced analysis.
• Tool : R, Python, SQL, R-Shiny
19. RWD in
Data
Visualization
• Monthly and weekly
performance metrics of the
products in market
• Interactive dashboard
implementation by filters
and graphs
• Patients Journey from
Diagnosis, EHR, to
Prescriptions
• Tool : Tableau, Power BI
19
20. RWD in Machine Learning/
Prediction
20
• Predicting the treatment effectiveness
• Clinical Trial Optimization on Study Design, Patients recruitments
• Drug Discovery on identifying potential drug targets
• Predicting personalized Medicine for patients
• Structured data from unstructured data ( e.g., Insights from
doctors’ notes)
• Tools : Jupyter, R Studio, Databricks, Machine Learning, Deep
Learning, Natural Language Procession, Convolutional Neural
Network
21. 21
Conclusion
• Real World Data could provide valuable medical and business(commercial)
insight and understanding in Real World Situation.
• RWD could bring comprehensive, holistic real world patient’s journey history
(e.g., up to 5 years or more ), leading to Personalized Medicine.
• RWD could support drug pipelines (e.g., clinical trials) and commercial drug
management (e.g., marketing and sales).
• More and more RWD will be integrated with Biometric function.
• RWD need technology investment (e.g., Big Data, Cloud Computing, Open-
Source, Machine Learning) to fully utilize its potential and benefits.