DATA SCIENCE IN HEALTHCARE
THE UNIVERSITY MALAYA MEDICAL CENTRE
BREAST CANCER ECOSYSTEM
ASSOC. PROF DR SARINDER K. DHILLON
DATA SCIENCE & BIOINFORMATICS LAB
UNIVERSITY OF MALAYA, KUALA LUMPUR, MALAYSIA
sarinder@um.edu.my
http://sarinderkaur.com
https://umexpert.um.edu.my/sarinder
DATA SCIENCE: Machine Learning, Knowledge graph, Deep Learning, Statistics, Big data, Cloud computing, Predictive analytics, Data mining, Text mining, Artificial Intelligence, Image Processing, noSQL
Data Science in Healthcare
• Transforming healthcare
• Data is used effectively
• Personalised medicine
• Predicting factors for improved decision making
• Historical data use in analytics
• Artificial intelligence enabled systems
• Electronic medical records/Electronic health records
FOCUS OF TALK
UNIVERSITY MALAYA MEDICAL CENTRE (UMMC)
DATA SCIENCE HEALTHCARE ECOSYSTEM
UMMC Data Science Healthcare Ecosystem
• Machine Learning for Prediction
• MyBCC (Malaysian Breast Cancer Cohort Study)
• Deep Learning for Classification
• Text Mining in Radiology
• UMMC Breast cancer clinical data & Electronic Medical Records
DEVELOPMENT OF ELECTRONIC
MEDICAL RECORD FOR
CLINICAL & RESEARCH PURPOSES:
BREAST CANCER CLINICAL DATA MANAGEMENT
USING POINT-OF-CARE AND MULTIDISCIPLINARY
DATA COLLECTION
FIRST PROJECT
In the News
“The new technology would facilitate medical practitioners, including doctors and nurses to identify and share patients' medical consultation information, as well as prescription of medicines through an integrated system.”
Bernama (2018, November 8). Health Minister: EMR system to be implemented at 145 hospitals within the next three years. The Star.
“With Big Data, healthcare organisations have the ability of information exchange, leading to a 360° view of their patients, so doctors can give a more complete diagnosis.”
Brar, K. (2018, January 31). Aiding healthcare through data analytics. The Star.
Common Scenario: Limitations with clinical data management
O B J E C T I V E
To enhance current i-Pesakit© system and achieve interoperability by consolidating
and automating manual retrieval processes of surgical, oncology, radiology, pathology
and pharmacy data into mineable data for clinical use, secondary data usage in audits
and breast cancer research activities (iResearch)
CLINIC VISITS
CLINICAL AUDIT &
REPORTING
DIAGNOSIS & TREATMENT
CLINICAL RESEARCH
Interdisciplinary Research Concept
Bridging the gap between research & clinical practice
Architecture System of the UMMC i-Pesakit© Breast Cancer Module
i-Pesakit© Breast Cancer Module During Weekly MDT Meeting
System Content Features and Implementation into Clinical Workflow
i-Pesakit Breast Cancer Chemotherapy e-Prescription
Clinicians’ Perception towards the System
Usage of i-Pesakit© Breast Cancer Module in Clinical Work
Is this system worth the time and effort to be used?
Effect on Clinical Workflow by Using i-Pesakit© Breast Cancer Module Among Clinicians
• Enhanced EMR provides point-of-care data in
clinics, wards and MDT meetings
• Ability to produce reports on breast cancer
survival in Malaysia for research, individual
hospital performance for policymakers to track
outcomes and provide direction in cancer
control.
• A time-effective approach that also produces new knowledge through data mining and analysis – Research & Development
Alternate Clinical Data Management Using Point-of-care
and Multidisciplinary Data Collection
Nor, N. A. M., Taib, N. A., Saad, M., Zaini, H. S., Ahmad, Z., Ahmad, Y., & Dhillon, S. K. (2018). Development of electronic medical records for clinical and research purposes: the breast cancer module using an implementation framework in a middle income country - Malaysia. BMC Bioinformatics, 19(Suppl 13), 1–16. http://doi.org/10.1186/s12859-018-2406-9 (ISI-indexed)
PUBLICATION IN BMC
BIOINFORMATICS
SECOND PROJECT
Application of machine learning to predict the important clinical prognostic
factors affecting survival rate of breast cancer
8066 samples from
University Malaya Medical
Centre (UMMC)
(1993-2018)
23 clinical factors
1 target variable (Life
status)
METHODOLOGY – Supervised learning (SL)
MACHINE LEARNING PIPELINE
1. Model Evaluation
2. Random Forest Further Modelling
3. Variable Selection
4. Decision Tree
5. Survival Curves Validation
RESULTS – SL – Model evaluation
No  Algorithm               Accuracy (%)  Sensitivity  Specificity  AUC   Precision  Matthews correlation coefficient
1   Decision tree           79.80         0.82         0.75         0.72  0.91       0.52
2   Random forest           82.70         0.83         0.81         0.86  0.93       0.59
3   Neural networks         82.00         0.83         0.79         0.84  0.93       0.58
4   Extreme boost           81.70         0.84         0.75         0.87  0.89       0.57
5   Logistic regression     81.10         0.82         0.78         0.85  0.92       0.55
6   Support vector machine  81.80         0.81         0.84         0.85  0.95       0.57
CALIBRATION ANALYSIS
RESULTS – SL – Variable importance
PART 1 RESULTS – SL – Variable importance
PART 1 RESULTS – SL – Important variables
RESULTS – SL – Decision tree
Decision tree for all data. Patients with curable cancer, ≤ 1 positive lymph nodes (PLN) and ≤ 2 total axillary lymph nodes removed (TLN) had a 50% survival probability, while patients with pre-cancer, ≤ 1 PLN and ≤ 2 TLN had a 90% survival probability. Patients with metastatic cancer, > 6 PLN and > 6 TLN had only a 25% survival probability.
PART 1 RESULTS – SL – Survival curves
VALIDATION USING AJCC MANUAL 5th edition
Variables                        AJCC          Machine learning
Classification of tumor size     < 2.0cm       <2.5cm
                                 1.0 < 4.0cm   2.5 < 4.8cm
                                 >4.0cm        4.8 < 11.0cm
                                               >11.0cm
Number of positive lymph nodes   ≤ 3           < 3
                                 3 < 6         3 < 9
                                 > 6           > 9
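As a small illustration of how the machine-learning cut-points for tumor size could be applied to new measurements, here is a sketch that bins a size in centimetres into one of the four groups. The cut-points 2.5, 4.8 and 11.0 come from the slide; the function name `tumor_size_group` and the decision to place a value that sits exactly on a cut-point in the upper group are assumptions, since the slide does not say which side the boundaries belong to.

```python
from bisect import bisect_right

# Cut-points (cm) read from the "Machine learning" column of the slide
ML_TUMOR_CUTS_CM = [2.5, 4.8, 11.0]

def tumor_size_group(size_cm, cuts=ML_TUMOR_CUTS_CM):
    """Return the 0-based group index for a tumor size.

    Assumption: a size exactly on a cut-point falls into the upper group.
    """
    return bisect_right(cuts, size_cm)

for size in (1.0, 2.5, 5.0, 12.0):
    print(size, "cm -> group", tumor_size_group(size))
```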
METHODOLOGY – Unsupervised learning (UL)
1. Determine optimal number of clusters
• Using the gap statistic method, factoextra library in R
• Labels of variables suggest the value of K (number of clusters)
• Run K-means and record distance, V
• V minimizes when K = n
• Visualize through scree plot
2. Determine most suitable clustering method
• Compare hierarchical, k-means and PAM (Partitioning Around Medoids) using the clValid R library
• Measured using connectivity, Dunn and Silhouette
3. Perform hierarchical clustering
• Variables subdivided into groups by cutting at a desired similarity level
• Function dist() in R was used to calculate a dissimilarity matrix as an input for hclust()
• Dendrogram was visualised using the fviz_dend() function in the factoextra library
4. Validate the pattern of clusters
• Clinicians validate the patterns of variables in each cluster
• Results to be compared with real-time survival analysis
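The study itself relied on R (factoextra and clValid). Purely to illustrate one of the three validation measures named in step 2, the sketch below computes the mean silhouette width of a labelled clustering in plain Python. The function name, the Euclidean distance and the toy points are all assumptions made for the example; it needs at least two clusters and is not a substitute for clValid.

```python
import math

def silhouette_width(points, labels):
    """Mean silhouette width of a clustering, using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / len(scores)

# Toy data: two well-separated pairs of points
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_width(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_width(points, [0, 1, 0, 1])   # groups mixed across pairs
print(round(good, 3), round(bad, 3))
```

Values close to 1 indicate compact, well-separated clusters, while values near or below 0 indicate that points sit closer to another cluster than to their own.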
RESULTS – UL
Optimal number of clusters = 6
Suitable clustering method = hierarchical clustering
METHOD        CONNECTIVITY  DUNN  SILHOUETTE
hierarchical  16.45         0.72  0.38
kmeans        16.34         0.68  0.38
pam           20.70         0.34  0.16
RESULTS – UL – Hierarchical clustering
V1: Age
V22: Total lymph nodes
V21: Hormonal therapy
V12: ER status
V13: PR status
V18: Method of axillary lymph node dissection
V2: Marital status
V7: Classification of breast cancer
V16: Surgery status
V19: Radiotherapy
V20: Chemotherapy
V3: Menopausal status
V8: Laterality
V6: Method of diagnosis
V15: Primary treatment type
V10: Grade of differentiation
V14: cerb2 status
V17: Type of surgery
V5: Race
V4: Presence of family history
V9: Cancer stage
V11: Tumor size
V23: Positive lymph nodes
PUBLICATION
PART 1 REPOSITORY
DEEP LEARNING APPROACHES:
LESION CLASSIFICATION
THIRD PROJECT
Classification of benign and malignant lesions in breast
ultrasound images:
Benign?
Or
Malignant?
DATASET
• Collected from UMMC between 2012-2013
• 83 Patients
• Size of images: 1400 × 1050
• 140 benign and 136 malignant images (biopsy confirmed)
Pre-Processing
• Augmentation
  - Rotation
  - Shift
  - Zoom
  - Flip
  - Shear
• Zero-Center Data
• Normalize data
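To make the pre-processing steps concrete, here is a minimal sketch on a tiny image held as a list of rows. It shows one augmentation (a horizontal flip) plus zero-centering and normalization. The helper names are hypothetical, and reading "normalize" as division by the standard deviation is an assumption; the slide does not say which scaling was used, and the study's real pipeline is not shown here.

```python
def horizontal_flip(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def zero_center_and_normalize(image):
    """Subtract the mean pixel value, then divide by the standard deviation."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = variance ** 0.5 or 1.0  # fall back to 1.0 for a completely flat image
    return [[(p - mean) / std for p in row] for row in image]

# Toy 2x2 "image" with made-up intensities
image = [[0, 50], [100, 250]]
flipped = horizontal_flip(image)
scaled = zero_center_and_normalize(image)
print(flipped)
print(scaled)
```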
TRADITIONALLY:
Handcrafted Features → Machine Learning → Prediction (Benign / Malignant)
OUR STUDY
Deep Learning
Transfer Learning
• Fine tuning: VGG16 Fine Tuning, ResNet50 Fine Tuning
• Feature extraction (VGG16): VGG-FC, VGG-SVM, VGG-DT, VGG-RF, VGG-AB
• Feature extraction (ResNet50): ResNet-FC, ResNet-SVM, ResNet-DT, ResNet-RF, ResNet-AB
• We used VGG16 and ResNet50 pre-trained on Breast US image dataset
• We used the convolutional layers of VGG16 and ResNet50 for feature extraction
• We used 12 deep learning models in our study
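The count of 12 models follows from the names on the slide: two fine-tuned networks plus two backbones combined with five classifier heads each. The sketch below only enumerates those names. It does not train anything, and the meaning of the head abbreviations (FC, SVM, DT, RF, AB) is not spelled out on the slide, so they are kept as plain labels.

```python
from itertools import product

backbones = ["VGG", "ResNet"]
# Classifier heads applied to the extracted features (labels as on the slide)
heads = ["FC", "SVM", "DT", "RF", "AB"]

fine_tuned = ["VGG16 Fine Tuning", "ResNet50 Fine Tuning"]
feature_extraction = [f"{b}-{h}" for b, h in product(backbones, heads)]
models = fine_tuned + feature_extraction

print(len(models))
print(models)
```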
ROC of different models
References     Dataset                      Deep learning models              Performance
[29]           4254 benign, 3154 malignant  GoogLeNet                         Accuracy: 91.23%, Sensitivity: 84.29%, Specificity: 96.07%
[30]           135 benign, 92 malignant     Boltzmann                         Accuracy: 93.4%, Sensitivity: 88.6%, Specificity: 97.1%
[31]           100 benign, 100 malignant    Deep Polynomial network+SVM       Accuracy: 92.40%, Sensitivity: 92.67%, Specificity: 91.36%
[32]           275 benign, 245 malignant    Stacked denoising Autoencoder     Accuracy: 82.4%, Sensitivity: 78.7%, Specificity: 85.7%
Current Study  249 benign, 190 malignant    Attention VGG16 + ensembled loss  Accuracy: 93%, Sensitivity: 96%, Specificity: 90%
THE STATE OF THE ART OF DEEP LEARNING MODELS IN BREAST ULTRASOUND LESION CLASSIFICATION.
FOURTH PROJECT OBJECTIVE
To develop a
fully automated AI-enabled database platform
with interactive visualisations
for researchers and clinicians
using The Malaysian Breast Cancer Survivorship Cohort (MyBCC) Study
5 Timelines: Baseline, 6 Months, 1 Year, 3 Years, 5 Years
603 Variables
909 Samples
May 2019
FOURTH PROJECT MYBCC
Timelines: Baseline, 6 Months, 1 Year, 3 Years, 5 Years
603 Variables
800 Patients
FOURTH PROJECT METHODOLOGY
FOURTH PROJECT METHODOLOGY – Automated machine learning
FOURTH PROJECT– Automated machine learning
Algorithm 1. Cartesian product to select lifestyle and clinical factors from different tables
Select 13 lifestyle factors, life status and survival years from table mybcc, and 4 clinical factors from table clinical, where mybcc.RN = clinical.RN (select the same patient ID from both tables)
\[ r = \sigma_{\mathrm{mybcc.RN} = \mathrm{clinical.RN}} \big( \pi_{RN,\, l_1, l_2, l_3, \ldots, l_{13},\, \mathrm{lifestatus},\, \mathrm{survivalyears}}(\mathrm{mybcc}) \times \pi_{RN,\, c_1, c_2, c_3, c_4}(\mathrm{clinical}) \big) \]
Definition:
r = relational database
σ = selection
π = projection
× = Cartesian product
mybcc = data table, which contains lifestyle factors
clinical = data table, which contains clinical factors
RN = patient ID/ primary key in both tables
lifestatus = life status of the patients (Alive/Dead)
survivalyears = Overall survival years of the patients
l1,l2,l3,…,l13 = 13 lifestyle factors
c1,c2,c3,c4 = 4 clinical factors
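The relational-algebra expression in Algorithm 1 can be tried out with Python's built-in sqlite3 module. The sketch below is a miniature, made-up version: it keeps the table names mybcc and clinical and the key RN from the slide, but uses only two lifestyle columns and one clinical column, and every row is invented for illustration. It writes the query as a product of two projections filtered on matching RN, which is equivalent to an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical miniature schema (the real tables hold 13 and 4 factors)
cur.execute(
    "CREATE TABLE mybcc (RN INTEGER PRIMARY KEY, l1 TEXT, l2 TEXT, "
    "lifestatus TEXT, survivalyears REAL)"
)
cur.execute("CREATE TABLE clinical (RN INTEGER PRIMARY KEY, c1 TEXT)")

# Invented rows, not patient data
cur.executemany(
    "INSERT INTO mybcc VALUES (?, ?, ?, ?, ?)",
    [
        (1, "active", "non-smoker", "Alive", 5.0),
        (2, "sedentary", "smoker", "Dead", 2.5),
        (3, "active", "smoker", "Alive", 4.0),
    ],
)
cur.executemany(
    "INSERT INTO clinical VALUES (?, ?)",
    [(1, "stage I"), (2, "stage III")],
)

# Projection on each table, Cartesian product, then selection on equal RN
rows = cur.execute(
    """
    SELECT m.RN, m.l1, m.l2, m.lifestatus, m.survivalyears, c.c1
    FROM (SELECT RN, l1, l2, lifestatus, survivalyears FROM mybcc) AS m,
         (SELECT RN, c1 FROM clinical) AS c
    WHERE m.RN = c.RN
    ORDER BY m.RN
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

Patient 3 has no row in the clinical table in this toy data, so the selection drops it; only patients present in both tables reach the analysis.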
FOURTH PROJECT– Automated machine learning
Algorithm 2. Python-HTML integration for automated machine learning
a = query1 + ((pm1,pm2,…,pmn) + ps + ph)
Definition:
a = automated analysis
query1 = \sigma_{\mathrm{mybcc.RN} = \mathrm{clinical.RN}} \big( \pi_{RN,\, l_1, l_2, l_3, \ldots, l_{13},\, \mathrm{lifestatus},\, \mathrm{survivalyears}}(\mathrm{mybcc}) \times \pi_{RN,\, c_1, c_2, c_3, c_4}(\mathrm{clinical}) \big) (Refer to Algorithm 1)
pm1,pm2,…,pmn = Python modules
ps = Python script to run each analysis
ph = Python-HTML connection via cgitb
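As a rough picture of the Python-to-HTML step, the sketch below builds the text a CGI-style script would print: a Content-Type header, a blank line, and an HTML table of results. The function name `render_cgi_response` and the metric values are hypothetical. The slide names cgitb, Python's CGI traceback helper; it is left out here because it is deprecated and has been removed from recent Python releases.

```python
from html import escape

def render_cgi_response(title, metrics):
    """Build a CGI-style response: header, blank line, then an HTML table."""
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>{value:.2f}</td></tr>"
        for name, value in metrics.items()
    )
    body = (
        f"<html><body><h1>{escape(title)}</h1>"
        f"<table>{rows}</table></body></html>"
    )
    return "Content-Type: text/html\n\n" + body

# Made-up numbers standing in for the output of an analysis script
response = render_cgi_response("Model A & B", {"accuracy": 0.8, "mcc": 0.5})
print(response)
```

Escaping the title and metric names keeps text that comes from the database from being interpreted as markup in the browser.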
FOURTH PROJECT– Automated machine learning
Algorithm 3. Model of the automated visualisation from database
v = query2 + (c1,c2,…,cn)
Definition:
v = visualisation
query2 = query1 + ((pm1,pm2,…pmn) + ps + ph)
(Refer to Algorithm 2)
c1,c2,…,cn = different types of charts
FOURTH PROJECT RESULTS – Digitized questionnaire
FOURTH PROJECT– Example of automated machine learning
FOURTH PROJECT– Automated quality of life scoring
FOURTH PROJECT– Interactive Visualisations
FOURTH PROJECT– Reporting/download
Other visualizations in progress
FIFTH PROJECT TEXT MINING
.in+0=========== REPORT TEXT ==========.br.br.br.br.br.br.brMRI THORACOLUMBAR +C of
13-Oct-2015:.br.brIndication.brMetastatic breast carcinoma. Currently complaint of weakness of the lower limb
bilaterally. TRO cord involvement/ compression..br.brSequences.brCoronal T1W, T2W.brSagittal T1W,T2W,
CISS3D.brAxial CISS3D, T2W.brPost gad- sag, axial.br.brFindings.brCorrelation made with previous CT dated
8.6.2015..br.brNormal spinal alignment..brMultilevel enhancing high signal intensity on T1/T2W/STIR is seen in
the thoracic and lumbar vertebral bodies, sacrum, both ilium and sternum..brReduced T12 vertebral body
height..brThey are expansion of the vertebral body at T3, T6, T9, T10 and T12 levels causing indentation of the
spinal cord worst at the T12 level with AP diameter of the spinal canal measures 0.8cm. .brThe rest of vertebral
body heights are preserved..brThe intervertebral disc returns normal signal..brThe spinal cord ends at L1. No
intramedullary lesion seen or high signals seen in spinal cord at T2WI. .br.brImpression.brBreast carcinoma
with .br1. Multiple vertebral body metastasis with multilevel spinal cord indentation..br2. Pathological
compression fracture and spinal canal stenosis at T12 level .br.br.brDrs Vithya / Nur Aishah / DR
NAZRI.br.br.br.br.br.br.brReport Written By: Dr MOHAMMAD NAZRI BIN MD. SHAH? 13-OCT-
2015 05:05 PM.brReport Approved By : Dr MOHAMMAD NAZRI BIN MD. SHAH? 13-OCT-2015 05:05
PM.br.br.br========== REPORT TEXT END =========.br.in-0
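A first text-mining step on a report like the one above is to strip the export markup and split the text into its sections. The sketch below assumes that `.br` marks a line break, that `.in+0` and `.in-0` are indent markers, and that Indication, Sequences, Findings and Impression are the section headings, all inferred from this single example. The function names and the shortened `raw` string are hypothetical, and this is not the pipeline used in the project.

```python
import re

def clean_report(raw):
    """Turn '.br' markers into newlines and drop the wrapper markup."""
    text = re.sub(r"\.in[+-]\d+", "", raw)  # '.in+0' / '.in-0' markers
    text = re.sub(r"=+ REPORT TEXT(?: END)? =+", "", text)
    text = text.replace(".br", "\n")
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

def split_sections(text, headings=("Indication", "Sequences", "Findings", "Impression")):
    """Group cleaned lines under the section heading that precedes them."""
    sections, current = {}, None
    for line in text.splitlines():
        if line in headings:
            current = line
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: " ".join(body) for name, body in sections.items()}

# Shortened excerpt in the same markup as the report above
raw = (
    ".in+0=========== REPORT TEXT ==========.br.brIndication"
    ".brMetastatic breast carcinoma."
    ".br.brFindings.brNormal spinal alignment."
    ".brReduced T12 vertebral body height."
    ".br.br========== REPORT TEXT END =========.br.in-0"
)
cleaned = clean_report(raw)
sections = split_sections(cleaned)
print(sections)
```

One limitation of the plain `.br` replacement: a sentence-ending full stop followed directly by a word that starts with "br" would be misread as a marker, so a production version would need a stricter rule.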
WORKFLOW
Towards an AI Enabled Digital Platform for Modern Cancer Healthcare
1. A Fully Automated EMR for BCM (iPesakit BCM)
2. AI Enabled Breast Cancer Research EMR (Research Databases)
3. AI Powered Sentiment Analysis (Public Opinion)
4. Mobile Application for E-Communication between doctors, nurses and patients
Monitoring & Predictions
Summary
• Data Science Is Reshaping Healthcare
• Techniques Such As Machine Learning, Deep Learning And Text Mining Are The Core of Data Science
• UMMC Is One of The Primary Hospitals In The Country To Embark On Data Science Projects
• The Data Science Pipeline Produced And Tested In These Projects Will Be Used In Other Asian Hospitals
THANK YOU FOR LISTENING
Professor Dr. Nur Aishah Binti Mohd Taib, UM,
Dr. Nurul Aqilah Mohd Nor, Bioinformatics, UM,
Dr. Tania Islam, MyBCC Project Manager, UM,
Professor Pietro Lio, Adviser, University of Cambridge, UK,
Tan Wee Ming, Programmer, student, Bioinformatics, UM,
Dr. Elham Yousef Kalafi, Researcher, Bioinformatics, UM
Mogana Darshini Ganggayah, student, Bioinformatics, UM

No Advance 9053900678 Chandigarh Call Girls , Indian Call Girls For Full Ni...
 
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
 
VIP Call Girls Lucknow Isha 🔝 9719455033 🔝 🎶 Independent Escort Service Lucknow
VIP Call Girls Lucknow Isha 🔝 9719455033 🔝 🎶 Independent Escort Service LucknowVIP Call Girls Lucknow Isha 🔝 9719455033 🔝 🎶 Independent Escort Service Lucknow
VIP Call Girls Lucknow Isha 🔝 9719455033 🔝 🎶 Independent Escort Service Lucknow
 

Data Science in Healthcare -The University Malaya Medical Centre Breast Cancer Ecosystem

  • 1. Data Science in Healthcare: The University Malaya Medical Centre Breast Cancer Ecosystem. Assoc. Prof. Dr Sarinder K. Dhillon, Data Science & Bioinformatics Lab, University of Malaya, Kuala Lumpur, Malaysia. sarinder@um.edu.my http://sarinderkaur.com https://umexpert.um.edu.my/sarinder
  • 3. Data Science in Healthcare: Transforming healthcare; Data is used effectively; Personalised medicine; Predicting factors for improved decision making; Historical data use in analytics; Artificial enabled systems; Electronic medical records/Electronic health records
  • 4. Focus of Talk: University Malaya Medical Centre (UMMC) Data Science Healthcare Ecosystem
  • 5. UMMC Data Science Healthcare Ecosystem Machine Learning for Prediction MyBCC ( Malaysian Breast Cancer Cohort Study) Deep Learning for Classification Text Mining in Radiology UMMC Breast cancer clinical data & Electronic Medical Records
  • 6. FIRST PROJECT Development of Electronic Medical Record for Clinical & Research Purposes: Breast Cancer Clinical Data Management Using Point-of-Care and Multidisciplinary Data Collection
  • 7. Bernama (2018, November 8). Health Minister: EMR system to be implemented at 145 hospitals within the next three years. The Star. Brar, K. (2018, January 31). Aiding healthcare through data analytics. The Star. “The new technology would facilitate medical practitioners, including doctors and nurses to identify and share patients' medical consultation information, as well as prescription of medicines through an integrated system.” “With Big Data, healthcare organisations have the ability of information exchange, leading to a 360° view of their patients, so doctors can give a more complete diagnosis.” In the News
  • 8. Common Scenario: Limitations with clinical data management
  • 9. O B J E C T I V E To enhance current i-Pesakit© system and achieve interoperability by consolidating and automating manual retrieval processes of surgical, oncology, radiology, pathology and pharmacy data into mineable data for clinical use, secondary data usage in audits and breast cancer research activities (iResearch) CLINIC VISITS CLINICAL AUDIT & REPORTING DIAGNOSIS & TREATMENT CLINICAL RESEARCH
  • 10. Interdisciplinary Research Concept Bridging the gap between research & clinical practice
  • 11. Architecture System of the UMMC i-Pesakit© Breast Cancer Module
  • 12. I-pesakit© Breast Cancer Module During Weekly MDT Meeting
  • 13. System Content Features and Implementation into Clinical Workflow
  • 14. i-Pesakit Breast Cancer Chemotherapy e-Prescription
  • 15. Clinicians’ Perception towards the System Usage of i-Pesakit© Breast Cancer Module in Clinical Work Is this system worth the time and effort to be used? Effect on Clinical Workflow by Using i-Pesakit© Breast Cancer Module Among Clinicians
  • 16. Alternate Clinical Data Management Using Point-of-care and Multidisciplinary Data Collection • Enhanced EMR provides point-of-care data in clinics, wards and MDT meetings • Ability to produce reports on breast cancer survival in Malaysia for research, individual hospital performance for policymakers to track outcomes and provide direction in cancer control. • A time effective approach while producing new knowledge through data mining and analysis – Research & Development. Nor, N. A. M., Taib, N. A., Saad, M., Zaini, H. S., Ahmad, Z., Ahmad, Y., & Dhillon, S. K. (2018). Development of electronic medical records for clinical and research purposes: the breast cancer module using an implementation framework in a middle income country - Malaysia. BMC Bioinformatics, 19(Suppl 13), 1–16. http://doi.org/10.1186/s12859-018-2406-9 (ISI-Indexed)
  • 19. SECOND PROJECT Application of machine learning to predict the important clinical prognostic factors affecting survival rate of breast cancer 8066 samples from University Malaya Medical Centre (UMMC) (1993-2018) 23 clinical factors 1 target variable (Life status)
  • 20. METHODOLOGY – Supervised learning (SL)
  • 21. MACHINE LEARNING PIPELINE 1. Model Evaluation 2. Random Forest Further Modelling 3. Variable Selection 4. Decision Tree 5. Survival Curves Validation
  • 22. RESULTS – SL – Model evaluation
      No  Algorithm               Accuracy (%)  Sensitivity  Specificity  AUC   Precision  MCC
      1   Decision tree           79.80         0.82         0.75         0.72  0.91       0.52
      2   Random forest           82.70         0.83         0.81         0.86  0.93       0.59
      3   Neural networks         82.00         0.83         0.79         0.84  0.93       0.58
      4   Extreme boost           81.70         0.84         0.75         0.87  0.89       0.57
      5   Logistic regression     81.10         0.82         0.78         0.85  0.92       0.55
      6   Support vector machine  81.80         0.81         0.84         0.85  0.95       0.57
      MCC = Matthews correlation coefficient
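As a reading aid for the columns in the model-evaluation table, the following minimal sketch shows how accuracy, sensitivity, specificity, precision and the Matthews correlation coefficient follow from a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the UMMC dataset. AUC is omitted because it needs predicted scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute confusion-matrix metrics for a binary classifier.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives.
    """
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient (MCC).
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),
        # By convention MCC is reported as 0 when the denominator is 0.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=10, tn=40, fn=20)
```

MCC uses all four cells of the confusion matrix, so it can be more informative than accuracy when the two life-status classes are imbalanced.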
  • 24. RESULTS – SL – Variable importance
  • 25. PART 1 RESULTS – SL – Variable importance
  • 26. PART 1 RESULTS – SL – Important variables
  • 27. RESULTS – SL – Decision tree Decision tree for all data; Shows that patients with curable cancer, ≤ 1 positive lymph nodes (PLN) and ≤ 2 total axillary lymph nodes removed (TLN) had 50% survival probability, while patients with pre-cancer, ≤ 1 PLN and ≤ 2 TLN had 90% survival probability. Patients with metastatic cancer, > 6 PLN and > 6 TLN had only 25% survival probability
  • 28. PART 1 RESULTS – SL – Survival curves
  • 29. VALIDATION USING AJCC MANUAL 5th edition
      Variables                       AJCC                           Machine learning
      Classification of tumor size    < 2.0cm; 1.0 < 4.0cm; > 4.0cm  < 2.5cm; 2.5 < 4.8cm; 4.8 < 11.0cm; > 11.0cm
      Number of positive lymph nodes  ≤ 3; 3 < 6; > 6                < 3; 3 < 9; > 9
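To see how the two sets of cut-offs group the same patients differently, the sketch below bins tumour sizes with each set. The AJCC cut-offs (2.0 cm and 4.0 cm) are taken as quoted in the editor's notes of this talk. The sample sizes, the function name, and the choice to place a value equal to a cut-off in the upper group are assumptions made for illustration.

```python
from bisect import bisect_right

# Cut-offs in cm, as quoted in this talk.
AJCC_TUMOUR_CUTOFFS_CM = [2.0, 4.0]
ML_TUMOUR_CUTOFFS_CM = [2.5, 4.8, 11.0]


def bin_index(value, cutoffs):
    """Return the 0-based group index of value for sorted cutoffs.

    Assumption: a value equal to a cut-off falls into the upper group.
    """
    return bisect_right(cutoffs, value)


# Hypothetical tumour sizes in cm.
sizes_cm = [1.5, 3.0, 4.5, 12.0]
ajcc_bins = [bin_index(size, AJCC_TUMOUR_CUTOFFS_CM) for size in sizes_cm]
ml_bins = [bin_index(size, ML_TUMOUR_CUTOFFS_CM) for size in sizes_cm]
```

With these inputs the 4.5 cm tumour lands in the largest AJCC group but only in the second machine-learning group, which is the kind of disagreement the validation slide compares.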
  • 30. SECOND PROJECT Application of machine learning to predict the important clinical prognostic factors affecting survival rate of breast cancer 8066 samples from University Malaya Medical Centre (UMMC) (1993-2018) 23 clinical factors 1 target variable (Life status)
  • 31. METHODOLOGY – Unsupervised learning (UL)
      1. Determine optimal number of clusters
         • Using the gap statistic method, factoextra library in R
         • Labels of variables suggest the value of K (number of clusters)
         • Run K-means and record distance, V
         • V minimizes when K = n
         • Visualize through scree plot
      2. Determine most suitable clustering method
         • Compare hierarchical, k-means and PAM (Partitioning Around Medoids) using the clValid R library
         • Measured using connectivity, Dunn and Silhouette
      3. Perform hierarchical clustering
         • Variables subdivided into groups by cutting at a desired similarity level
         • The dist() function in R was used to calculate a dissimilarity matrix as input to hclust()
         • Dendrogram was visualised using the fviz_dend() function in the factoextra library
      4. Validate the pattern of clusters
         • Clinicians validate the patterns of variables in each cluster
         • Results to be compared with real-time survival analysis
  • 32. RESULTS – UL Optimal number of clusters = 6. Suitable clustering method = hierarchical clustering
      METHOD        CONNECTIVITY  DUNN  SILHOUETTE
      hierarchical  16.45         0.72  0.38
      kmeans        16.34         0.68  0.38
      pam           20.70         0.34  0.16
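The Dunn column can be read as the ratio of the smallest distance between points in different clusters to the largest distance between points in the same cluster, so higher values indicate compact, well-separated clusters. The study used the clValid R library. The pure-Python sketch below is only an illustration of that definition, with a hypothetical function name and made-up points.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Dunn index: min between-cluster distance / max within-cluster distance."""
    min_between = float("inf")
    max_within = 0.0
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        d = dist(p, q)  # Euclidean distance
        if label_p == label_q:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    if max_within == 0.0:
        raise ValueError("every cluster has a single point; index undefined")
    return min_between / max_within


# Made-up 2-D points forming two tight, well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good_partition = dunn_index(points, [0, 0, 1, 1])
poor_partition = dunn_index(points, [0, 1, 0, 1])
```

The partition that keeps each tight pair together scores far higher than the one that splits the pairs.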
  • 33. RESULTS – UL – Hierarchical clustering V1: Age V22: Total lymph nodes V21: Hormonal therapy V12: ER status V13: PR status V18: Method of axillary lymph node dissection V2: Marital status V7: Classification of breast cancer V16: Surgery status V19: Radiotherapy V20: Chemotherapy V3: Menopausal status V8: Laterality V6: Method of diagnosis V15: Primary treatment type V10: Grade of differentiation V14: cerb2 status V17: Type of surgery V5: Race V4: Presence of family history V9: Cancer stage V11: Tumor size V23: Positive lymph nodes
  • 36. THIRD PROJECT Deep Learning Approaches: Lesion Classification
  • 37. Classification of benign and malignant lesions in breast ultrasound images: Benign? Or Malignant?
  • 38. DATASET • Collected from UMMC between 2012-2013 • 83 Patients • Size of images: 1400 × 1050 • 140 Benign and 136 Malignant images (Biopsy Confirmed) Pre-Processing • Augmentation: Rotation, Shift, Zoom, Flip, Shear • Zero-Center Data • Normalize data
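The pre-processing steps can be illustrated on a tiny grayscale image stored as nested lists. This is a minimal sketch, not the study's pipeline: it assumes "normalize" means dividing by the standard deviation after zero-centring, it shows only the flip augmentation, and the function names and 2 × 2 image are hypothetical.

```python
from statistics import fmean, pstdev


def zero_centre_and_normalise(image):
    """Subtract the mean pixel value, then divide by the standard deviation."""
    pixels = [value for row in image for value in row]
    mean = fmean(pixels)
    std = pstdev(pixels) or 1.0  # avoid dividing by zero on a constant image
    return [[(value - mean) / std for value in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right (one simple augmentation)."""
    return [row[::-1] for row in image]


# Hypothetical 2 x 2 grayscale image.
image = [[0, 50], [100, 250]]
flipped = horizontal_flip(image)
standardised = zero_centre_and_normalise(image)
```

In practice an image library or a deep learning framework's augmentation utilities would do this on arrays; the nested-list version only makes the arithmetic visible.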
  • 39. TRADITIONALLY : Handcrafted Features Machine Learning Prediction Benign Malignant
  • 40. OUR STUDY Deep Learning Transfer Learning Feature Extraction VGG16 Fine Tuning ResNet50 Fine Tuning VGG-FC VGG-SVM VGG-DT VGG-RF VGG-AB ResNet-FC ResNet-SVM ResNet-DT ResNet-RF ResNet-AB • We use VGG16 and ResNet50 pre-trained on Breast US image dataset • We used Convolutional layers of VGG16 and ResNet50 for feature extraction • 12 Deep Learning models that we used in our study
  • 42. THE STATE OF THE ART OF DEEP LEARNING MODELS IN BREAST ULTRASOUND LESION CLASSIFICATION.
      References     Dataset                      Deep learning models              Accuracy  Sensitivity  Specificity
      [29]           4254 benign, 3154 malignant  GoogLeNet                         91.23%    84.29%       96.07%
      [30]           135 benign, 92 malignant     Boltzmann                         93.4%     88.6%        97.1%
      [31]           100 benign, 100 malignant    Deep Polynomial network + SVM     92.40%    92.67%       91.36%
      [32]           275 benign, 245 malignant    Stacked denoising Autoencoder     82.4%     78.7%        85.7%
      Current Study  249 benign, 190 malignant    Attention VGG16 + ensembled loss  93%       96%          90%
  • 43. FOURTH PROJECT OBJECTIVE To develop a fully automated AI-enabled database platform with interactive visualisations for researchers and clinicians using The Malaysian Breast Cancer Survivorship Cohort (MyBCC) Study Baseline 6 Months 1 Year 3 Years 5 Years 5 Timelines 603 Variables 909 Samples May 2019
  • 44. FOURTH PROJECT MYBCC Baseline 6 Months 1 Year 3 Years 5 Years 603 Variables 800 Patients
  • 46. FOURTH PROJECT METHODOLOGY – Automated machine learning
  • 47. FOURTH PROJECT – Automated machine learning Algorithm 1. Cartesian product to select lifestyle and clinical factors from different tables
      Select 13 lifestyle factors, life status and survival years from table mybcc, and 4 clinical factors from table clinical, where mybcc.RN = clinical.RN (select the same patient ID from both tables):
      $r = \sigma_{mybcc.RN = clinical.RN}\big(\pi_{RN, l_1, l_2, l_3, \dots, l_{13}, lifestatus, survivalyears}(mybcc) \times \pi_{RN, c_1, c_2, c_3, c_4}(clinical)\big)$
      Definition:
      r = relational database
      σ = selection
      π = projection
      × = Cartesian product
      mybcc = data table, which contains lifestyle factors
      clinical = data table, which contains clinical factors
      RN = patient ID / primary key in both tables
      lifestatus = life status of the patients (Alive/Dead)
      survivalyears = overall survival years of the patients
      l1, l2, l3, …, l13 = 13 lifestyle factors
      c1, c2, c3, c4 = 4 clinical factors
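Algorithm 1 is a selection over the Cartesian product of two projections, which in SQL is a join on RN. The sketch below shows this on an in-memory SQLite database. The table and column names follow the definitions above, but only one lifestyle factor (l1) and one clinical factor (c1) are included, and all rows are made up. The actual MyBCC database engine is not stated in the slides, so SQLite is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mybcc (RN INTEGER PRIMARY KEY, l1 TEXT,
                        lifestatus TEXT, survivalyears REAL);
    CREATE TABLE clinical (RN INTEGER PRIMARY KEY, c1 TEXT);
    -- Made-up rows: patient 2 has no clinical record, patient 3 no lifestyle record.
    INSERT INTO mybcc VALUES (1, 'active', 'Alive', 5.0), (2, 'sedentary', 'Dead', 2.5);
    INSERT INTO clinical VALUES (1, 'stage I'), (3, 'stage II');
    """
)

# Projection (column lists), Cartesian product (CROSS JOIN), selection (WHERE).
rows = conn.execute(
    """
    SELECT m.RN, m.l1, m.lifestatus, m.survivalyears, c.c1
    FROM mybcc AS m CROSS JOIN clinical AS c
    WHERE m.RN = c.RN
    """
).fetchall()
conn.close()
```

Only patients present in both tables survive the selection, so incomplete records drop out before any analysis runs.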
  • 48. FOURTH PROJECT – Automated machine learning Algorithm 2. Python-HTML integration for automated machine learning
      $a = query_1 + ((pm_1, pm_2, \dots, pm_n) + ps + ph)$
      Definition:
      a = automated analysis
      $query_1 = \sigma_{mybcc.RN = clinical.RN}\big(\pi_{RN, l_1, l_2, l_3, \dots, l_{13}, lifestatus, survivalyears}(mybcc) \times \pi_{RN, c_1, c_2, c_3, c_4}(clinical)\big)$ (refer to Algorithm 1)
      pm1, pm2, …, pmn = Python modules
      ps = Python script to run each analysis
      ph = Python-HTML connection via cgitb
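The last term of Algorithm 2, the Python-to-HTML step, can be sketched as a function that turns analysis results into an HTML table. The slides name cgitb for this connection; that module was removed from the standard library in Python 3.13, so this sketch uses only html.escape and string formatting. The function name and the example row are hypothetical.

```python
from html import escape


def render_html_table(headers, rows):
    """Render headers and rows as an HTML table, escaping every cell."""
    head = "".join(f"<th>{escape(str(header))}</th>" for header in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Example row reusing one figure from the model-evaluation table.
page = render_html_table(["Algorithm", "Accuracy (%)"], [("Random forest", 82.7)])
```

Escaping every cell matters here because values read from a clinical database may contain characters such as < or & that would otherwise break the page.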
  • 49. FOURTH PROJECT – Automated machine learning Algorithm 3. Model of the automated visualisation from database
      $v = query_2 + (c_1, c_2, \dots, c_n)$
      Definition:
      v = visualisation
      $query_2 = query_1 + ((pm_1, pm_2, \dots, pm_n) + ps + ph)$ (refer to Algorithm 2)
      c1, c2, …, cn = different types of charts
  • 50. FOURTH PROJECT RESULTS – Digitized questionnaire
  • 51. FOURTH PROJECT– Example of automated machine learning
  • 52. FOURTH PROJECT– Automated quality of life scoring
  • 56. FIFTH PROJECT TEXT MINING .in+0=========== REPORT TEXT ==========.br.br.br.br.br.br.brMRI THORACOLUMBAR +C of 13-Oct-2015:.br.brIndication.brMetastatic breast carcinoma. Currently complaint of weakness of the lower limb bilaterally. TRO cord involvement/ compression..br.brSequences.brCoronal T1W, T2W.brSagittal T1W,T2W, CISS3D.brAxial CISS3D, T2W.brPost gad- sag, axial.br.brFindings.brCorrelation made with previous CT dated 8.6.2015..br.brNormal spinal alignment..brMultilevel enhancing high signal intensity on T1/T2W/STIR is seen in the thoracic and lumbar vertebral bodies, sacrum, both ilium and sternum..brReduced T12 vertebral body height..brThey are expansion of the vertebral body at T3, T6, T9, T10 and T12 levels causing indentation of the spinal cord worst at the T12 level with AP diameter of the spinal canal measures 0.8cm. .brThe rest of vertebral body heights are preserved..brThe intervertebral disc returns normal signal..brThe spinal cord ends at L1. No intramedullary lesion seen or high signals seen in spinal cord at T2WI. .br.brImpression.brBreast carcinoma with .br1. Multiple vertebral body metastasis with multilevel spinal cord indentation..br2. Pathological compression fracture and spinal canal stenosis at T12 level .br.br.brDrs Vithya / Nur Aishah / DR NAZRI.br.br.br.br.br.br.brReport Written By: Dr MOHAMMAD NAZRI BIN MD. SHAH? 13-OCT- 2015 05:05 PM.brReport Approved By : Dr MOHAMMAD NAZRI BIN MD. SHAH? 13-OCT-2015 05:05 PM.br.br.br========== REPORT TEXT END =========.br.in-0
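A first text-mining step for a report like the one above is to strip the layout markers and split the free text into its named sections. The sketch below assumes that ".br" marks a line break, that ".in+0" and ".in-0" are indentation markers, and that section headings appear alone on a line. The function name is hypothetical and the sample is a shortened excerpt of the report above.

```python
import re

SECTION_NAMES = ("Indication", "Sequences", "Findings", "Impression")


def split_report(raw):
    """Split a raw radiology report into {section name: text}."""
    text = re.sub(r"\.in[+-]0", "", raw)  # drop .in+0 / .in-0 markers
    lines = [line.strip() for line in text.split(".br")]
    sections, current = {}, None
    for line in lines:
        if line in SECTION_NAMES:
            current = line
            sections[current] = []
        elif current and line and not line.startswith("="):
            # Skip blank lines and the "==== REPORT TEXT ====" banners.
            sections[current].append(line)
    return {name: " ".join(body) for name, body in sections.items()}


raw_report = (
    ".in+0=========== REPORT TEXT ==========.br.brIndication.br"
    "Metastatic breast carcinoma..br.brFindings.brNormal spinal alignment..br"
    "Reduced T12 vertebral body height..br.brImpression.brBreast carcinoma with .br"
    "1. Multiple vertebral body metastasis with multilevel spinal cord indentation..br"
    "========== REPORT TEXT END =========.br.in-0"
)
report_sections = split_report(raw_report)
```

On the full report this simple splitter would also attach the sign-off lines to the Impression section, so a real pipeline would need extra rules for signatures and timestamps.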
  • 58. Towards an AI Enabled Digital Platform for Modern Cancer Healthcare A Fully Automated EMR for BCM (iPesakit BCM) AI Enabled Breast Cancer Research EMR AI Powered Sentiment Analysis Research Databases 1 2 3 Public Opinion Mobile Application for E-Communication between doctors, nurses and patients 4 Monitoring & Predictions
  • 59. Summary • Data Science Is Reshaping Healthcare • Techniques Such As Machine Learning, Deep Learning, Text Mining Are The Core of Data Science • UMMC Is One of The Primary Hospitals In The Country To Embark On Data Science Projects • The Data Science Pipeline Produced And Tested In These Projects Will Be Used In Other Asian Hospitals THANK YOU FOR LISTENING
  • 60. Professor. Dr. Nur Aishah Binti Mohd Taib, UM, Dr. Nurul Aqilah Mohd Nor, Bioinformatics, UM, Dr. Tania Islam, MyBCC Project Manager, UM, Professor Pietro Lio, Adviser, University of Cambridge, UK, Tan Wee Ming, Programmer, student, Bioinformatics, UM, Dr. Elham Yousef Kalafi, Researcher, Bioinformatics, UM Mogana Darshini Ganggayah, student, Bioinformatics, UM

Editor's Notes

  1. There are 4 studies to be discussed.
  2. Newspaper clips from NST and The Star in 2018 acknowledge the importance of well-structured clinical big data management and highlight improving the quality of service in healthcare. However… The EMR innovation matches the UMMC direction and priorities towards the hospital’s mission, especially in bridging the gap between clinical practice and research through an efficient data system. It involves multiple disciplines: clinicians and nurses as front-line staff who will deliver the innovation, system structure design by bioinformaticians and module content by clinicians, and implementation by the IT experts.
  3. Data capture methods had been manual and done retrospectively by tracing notes of patients’ clinical and treatment characteristics. This method is expensive, with a high probability of missing values and inaccuracies. Reducing manual work through automated data capture systems increases workflow efficiency and leads to better research outcomes. In a typical clinical set-up, these primary data are used for surgical audits in measuring the hospital performance, while the secondary-use data will be used in epidemiological analysis in breast cancer outcome research. Aggregating data from different sources in healthcare and research is important [17]. The effectiveness and data quality of records can be improved through the enhancement of the research database features. Elements needed for a successful clinical research database include engagement of clinicians, utility for research and the ability to integrate with the legacy systems [18].
  4. Advances in the scientific field, particularly in the medical domain, have led to an increase in clinical data production, which offers enhancement opportunities for the clinical research sector. More initiatives have to be expanded through interconnection of computational strategies such as the electronic medical record (EMR) to interact with research platforms. EMR is primarily designed to meet clinical practice needs for patient care. As the usage of EMR expands, there are more opportunities in extending the system through data interoperability to facilitate breast cancer clinical research activities. Our objective is to extend EMR and develop a breast cancer research knowledgebase system for easy data access, secondary data use for data mining, and interoperability between multiple clinical departments. The rationale of establishing the proposed EMR system is to provide convenient data access to users in a typical clinical set-up. In this study, we introduce a breast cancer research clinical data management system integrated with EMR in the clinical environment of diagnosed breast cancer patients in UMMC. We adopted the Quality Implementation Framework (QIF) because it synthesizes existing models and research support to provide a conceptual overview of the critical steps that comprise quality implementation.
  5. There are five groups of crucial members in this EMR implementation: the project manager and the critical stakeholders, which include the hospital management and governance team, physician champions, the system design and development teams, and the evaluation and quality assurance team. The project manager is the lead person in facilitating these implementation steps, connecting the different implementation phases and coordinating the planning, design, development, and testing phases between team members. The hospital management and governance team from the Patient Information Department provides feedback on governance and policy matters pertaining to data sharing, privacy and confidentiality. Physician champions have credibility with clinical staff and hospital administration, and promote the value of the innovation through stakeholder engagement. They are also the main point of reference from the clinical perspective, acting as experts on content and EMR functionality so that the digital workflow closely matches the actual clinical workflow. The system designers (bioinformaticians) act as a liaison between the physician champions and the system development team (IT staff) in connecting ideas and suitable concepts. Bioinformaticians design the digital system workflow, templates and structure by gathering EMR requirements from physician champions and putting them in technical form for system developers to take into the development phase. IT staff are responsible for building, customizing and deploying the breast cancer module, as well as providing maintenance for the system. Evaluation is conducted by the evaluation and quality assurance team (breast care nurses), who conduct on-site system testing and performance review and coordinate user training within the practice.
  6. MyBCC : Malaysian Breast Cancer Survivorship Cohort Breast Q : patients experience with the service, quality measurement
  7. Gone Live since February 2016 VIDEO
  8. We have been working closely with the clinical specialists as well as IT experts in coming out with the best solution that would benefit both clinicians and researchers. Progressed with Onco department ; in e-prescribing chemotherapy, which links to the Pharm dept
  9. Survey results based on system evaluation test survey conducted at Department of Surgery on 18/7/2017 - Clinicians spend less time interacting with patients Lessons learnt : -Personal Data Protection Act (PDPA) Challenges : -System testing - User training
  10. What do we propose? The Ninth Malaysia Plan (9MP), which is a national budget focus for 2009, highlighted strengthening the Health Information System to improve the point-of-care service and information access; however, to date, the success rate has been low. There is an absolute urgency in developing a reliable, integrated and interoperable Health Information Management system using an implementation framework. Enhanced EMR provides point-of-care data for clinical visits, which allows data tracking over time and is useful in identifying patients who are due for preventive visits and screenings, cancer follow-up, and monitoring recurrence and death. The ability to produce reports on BC survival in Malaysia to be used by stakeholders (clinicians, researchers, and the government) is essential for research, and individual hospital performance for policymakers to track outcomes and provide direction in cancer control. A time-effective approach while producing new knowledge through data mining and analysis – Research & Development
  11. The first one is the implementation of machine learning in prediction of important factors influencing survival rate of breast cancer. We used 6 different machine learning algorithms to evaluate the dataset, which contains 8066 patients’ records with 23 predictors.
  12. There are 5 steps in the methodology (model evaluation, random forest further modelling, variable selection, decision tree and survival curves for validation).
  13. In calibration analysis, all the algorithms show well calibrated measure for the provided dataset. The support vector machine classifier produced a sigmoid curve due to the margin property of hinge loss as it focused on hard samples closer to decision boundaries (the support vectors). The dataset for the prediction of breast cancer survival (‘all data’) seemed sufficiently reliable to proceed with the other steps, mainly because the calibration measures were closer to the diagonal or identity.
  14. Both tumour size and positive lymph nodes are the determinants of the stage of breast cancer, according to AJCC, and as expected, these were predicted as important variables in the variable selection process in this study. The tumour size (TS) separation in this study was (<2.5 cm, 2.5 < 4.8 cm, 4.8 < 11 cm, and >11 cm), while the AJCC manual categorised the TS as less than or equal to 2 cm (T1) for Stage I, 2 – 4 cm (T2) for Stage II, and more than 4 cm (T3) for Stage III. The positive lymph nodes (PLN) separation generated from the DT analysis of this study was (<3, 3<9, and 9<18), whereas in the AJCC staging, PLN of less than or equal to 3 (N1) fell under Stage II breast cancer, PLN between 3 and 6 (N2) was categorised as Stage IIIA, and PLN exceeding 6 (N3) was under Stage IIIB.
  15. MyBCC study was started in 2012 to conduct a cohort study on different lifestyle factors influencing survival rate of breast cancer patients from multi-ethnic origin in Malaysia. There are 5 different timelines which are baseline (diagnosis), 6 months, 1 year, 3 years and 5 years. There are 603 variables/ questions and 800 patients recruited until 2018.
  16. Different types of visualizations for other variables are in progress.
  17. Radiologists report their findings from mammogram or ultrasound in free text; moreover, different radiologists have different ways of narrating the report. Thus, a structured radiology reporting system is needed in the medical field.