SlideShare a Scribd company logo
Revolutionizing
Patient Data Privacy:
Federated Learning & Clustering
in Healthcare
Deep dive into “Privacy-preserving patient clustering for
personalized federated learning” written by Ahmed
Elhussein and Gamze Gursoy and presented at the Machine
Learning for Healthcare 2023 conference
Presented by Kelly Nguyen
Research
Introduction to Healthcare Data Privacy
Deep Learning in Healthcare
Challenges in Patient Data Clustering
Federated Learning as a Privacy Beacon
Innovations of PCBFL
Methodology
Results and Significance
Conclusion
Personal Insights
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
Overview
Research
Paper: Privacy-preserving
patient clustering for
personalized federated learning
Written by Ahmed Elhussein and
Gamze Gursoy
Presented at Machine Learning for
Healthcare 2023
Focus on non-IID data and privacy
in federated learning
In the realm of healthcare, the sanctity of patient data is paramount
With the advent of digital record-keeping, the potential for sensitive information to be
compromised has escalated, necessitating robust solutions to ensure privacy.
The critical need for safeguarding healthcare data cannot be overstated
not just a matter of legal compliance, but of maintaining the trust at the very core of the
patient-provider relationship.
Introduction to Healthcare
Data Privacy
Enhances disease risk prediction,
diagnostic support
Relies on large EHR datasets
Limitations
overfitting and poor generalization in
small datasets
Deep
Learning in
Healthcare
Challenges in
Patient Data
Clustering
1
Necessity for
personalized FL
models
2
Risks of traditional
clustering methods
3
Privacy conflicts
with data sharing
A groundbreaking technique that promises to revolutionize the way medical data
is utilized for research and analysis
Approach allows for the development of powerful, predictive models by learning from
decentralized datasets, without ever compromising the privacy of the individual’s data.
By processing data locally and sharing only model updates instead of raw information,
federated learning provides a blueprint for harnessing the collective power of
healthcare data while upholding the inviolable principle of patient privacy.
Federated Learning as a
Privacy Beacon
Innovations
of PCBFL
Privacy-preserving Community-Based
Federated Learning (PCBFL)
Utilizes Secure Multiparty Computation (SMPC) to
enable the clustering of patient data without
exposing individual patient information
3
Methodology
Cohort and
Feature
Extraction
Privacy Preserving
CBFL Protocol
Model Training Evaluation
In order to create a more realistic FL scenario, a
subsampling approach was used, where the size of the
dataset in each site was limited.
The study utilized the eICU collaborative research
database, comprising data from over 200,000 patients
across the U.S., to predict ICU mortality using variables like
diagnosis, medications, and physical exams within the first
48 hours of admission.
Focused primarily on patients with complete data to
avoid imputation bias, leading to a cohort of 5,000
patients from 20 sites, aligning with a realistic
federated learning scenario with limited site data.
Cohort and
Feature
Extraction
Creating Patient
Embeddings
Estimating Patient
Similarity Securely
Clustering Patients Predicting Mortality
PCBFL
Protocol
A federated autoencoder is
trained to obtain latent
variables for each feature
domain.
SMPC uses a secret sharing
scheme to jointly calculate
the dot product between
pairs of vectors
Spectral clustering to cluster
the patients using similarity
matrix generated from
pairwise cosine similarities
of embeddings
Cluster-based FL training.
Each model is separately
trained per cluster.
Feed Forward models were implemented with ReLU activation in the hidden
layers and sigmoid activation in the final output layers.
Model Training
Analyzed patient clusters formed by Privacy-
preserving Community-Based Federated
Learning (PCBFL) and Community-Based
Federated Learning (CBFL), assessing
mortality rates and feature distributions
Evaluation
The data was divided into training and testing
sets at a 70:30 ratio.
Given the imbalance in the dataset with only
20% positive labels, performance was
evaluated using both AUC and AUPRC scores.
The models were run 100 times to calculate
mean scores and determine 95% confidence
intervals with bootstrap estimation.
Results
Things that PCBFL provided includes
privacy-preserving and accurate
patient similarity scores using secure
multiparty computation, clinically
meaningful clusters, and an increase
in predictive performance.
All in all, the paper presents a personalized federated learning (FL) framework that
integrates data mining techniques like clustering to address data heterogeneity and
privacy in healthcare.
The study overall underscores PCBFL’s promise for collaborative healthcare data analysis
which enhances patient privacy and future research aims to assess its generalizability
across different healthcare datasets and domains
Conclusion
I found this research very interesting because not only does it explore traditional deep mining
tactics such as clustering, but it goes into advanced modernized solutions such as federated
learning.
With personal experience working in biotech, the rise of using data mining and deep learning in
healthcare is not only revolutionary, but also extraordinary in improving the quality of
healthcare itself.
I look forward to reading more research and articles on just that.
Personal Insights
Thank You
For further questions, feel free to email me at
kelly.nguyen01@sjsu.edu

More Related Content

Similar to Short Story Assignment by Kelly Nguyen

NURS 521 Nursing Informatics And Technology.docx
NURS 521 Nursing Informatics And Technology.docxNURS 521 Nursing Informatics And Technology.docx
NURS 521 Nursing Informatics And Technology.docx
stirlingvwriters
 
readmission paper applied health econ
readmission paper applied health econreadmission paper applied health econ
readmission paper applied health econ
Pamela Bonifay Peele
 
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMSINTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
hiij
 
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMSINTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
hiij
 
Introduction Healthcare system is considered one of the busiest.pdf
Introduction Healthcare system is considered one of the busiest.pdfIntroduction Healthcare system is considered one of the busiest.pdf
Introduction Healthcare system is considered one of the busiest.pdf
bkbk37
 
Informatics and nursing 2015 2016.odette richards
Informatics and nursing 2015 2016.odette richardsInformatics and nursing 2015 2016.odette richards
Informatics and nursing 2015 2016.odette richards
Odette Richards
 
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docxChapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
christinemaritza
 
Real world Evidence and Precision medicine bridging the gap
Real world Evidence and Precision medicine bridging the gapReal world Evidence and Precision medicine bridging the gap
Real world Evidence and Precision medicine bridging the gap
ClinosolIndia
 
By administering assessments and analyzing the results, targeted a
By administering assessments and analyzing the results, targeted aBy administering assessments and analyzing the results, targeted a
By administering assessments and analyzing the results, targeted a
TawnaDelatorrejs
 
Machine Learning Algorithms for Predictive Analytics in Precision Medicine
Machine Learning Algorithms for Predictive Analytics in Precision MedicineMachine Learning Algorithms for Predictive Analytics in Precision Medicine
Machine Learning Algorithms for Predictive Analytics in Precision Medicine
ClinosolIndia
 
Enabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
Enabling Analytics on Sensitive Medical Data with Secure Multiparty ComputationEnabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
Enabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
Wessel Kraaij
 
Improving health care outcomes with responsible data science
Improving health care outcomes with responsible data scienceImproving health care outcomes with responsible data science
Improving health care outcomes with responsible data science
Wessel Kraaij
 
The Use of Health Information Technology to Improve Care and .docx
The Use of Health Information Technology to Improve Care and .docxThe Use of Health Information Technology to Improve Care and .docx
The Use of Health Information Technology to Improve Care and .docx
pelise1
 
J1803026569
J1803026569J1803026569
J1803026569
IOSR Journals
 
Complex Health Data Visualization
Complex Health Data VisualizationComplex Health Data Visualization
Complex Health Data Visualization
Nicholas Tenhue
 
Exploring outcomes 12345
Exploring outcomes 12345Exploring outcomes 12345
Exploring outcomes 12345
mikewolf702
 
A Large-Scale Informatics Database to Advance Research and Discovery of the E...
A Large-Scale Informatics Database to Advance Research and Discovery of the E...A Large-Scale Informatics Database to Advance Research and Discovery of the E...
A Large-Scale Informatics Database to Advance Research and Discovery of the E...
Jesus J Caban, PhD
 
826 Unertl et al., Describing and Modeling WorkflowResearch .docx
826 Unertl et al., Describing and Modeling WorkflowResearch .docx826 Unertl et al., Describing and Modeling WorkflowResearch .docx
826 Unertl et al., Describing and Modeling WorkflowResearch .docx
evonnehoggarth79783
 
Module 5 (week 9) - InterventionAs you continue to work on your .docx
Module 5 (week 9) - InterventionAs you continue to work on your .docxModule 5 (week 9) - InterventionAs you continue to work on your .docx
Module 5 (week 9) - InterventionAs you continue to work on your .docx
roushhsiu
 
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
Kaela Johnson
 

Similar to Short Story Assignment by Kelly Nguyen (20)

NURS 521 Nursing Informatics And Technology.docx
NURS 521 Nursing Informatics And Technology.docxNURS 521 Nursing Informatics And Technology.docx
NURS 521 Nursing Informatics And Technology.docx
 
readmission paper applied health econ
readmission paper applied health econreadmission paper applied health econ
readmission paper applied health econ
 
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMSINTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
 
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMSINTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
INTEGRATING MACHINE LEARNING IN CLINICAL DECISION SUPPORT SYSTEMS
 
Introduction Healthcare system is considered one of the busiest.pdf
Introduction Healthcare system is considered one of the busiest.pdfIntroduction Healthcare system is considered one of the busiest.pdf
Introduction Healthcare system is considered one of the busiest.pdf
 
Informatics and nursing 2015 2016.odette richards
Informatics and nursing 2015 2016.odette richardsInformatics and nursing 2015 2016.odette richards
Informatics and nursing 2015 2016.odette richards
 
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docxChapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx
 
Real world Evidence and Precision medicine bridging the gap
Real world Evidence and Precision medicine bridging the gapReal world Evidence and Precision medicine bridging the gap
Real world Evidence and Precision medicine bridging the gap
 
By administering assessments and analyzing the results, targeted a
By administering assessments and analyzing the results, targeted aBy administering assessments and analyzing the results, targeted a
By administering assessments and analyzing the results, targeted a
 
Machine Learning Algorithms for Predictive Analytics in Precision Medicine
Machine Learning Algorithms for Predictive Analytics in Precision MedicineMachine Learning Algorithms for Predictive Analytics in Precision Medicine
Machine Learning Algorithms for Predictive Analytics in Precision Medicine
 
Enabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
Enabling Analytics on Sensitive Medical Data with Secure Multiparty ComputationEnabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
Enabling Analytics on Sensitive Medical Data with Secure Multiparty Computation
 
Improving health care outcomes with responsible data science
Improving health care outcomes with responsible data scienceImproving health care outcomes with responsible data science
Improving health care outcomes with responsible data science
 
The Use of Health Information Technology to Improve Care and .docx
The Use of Health Information Technology to Improve Care and .docxThe Use of Health Information Technology to Improve Care and .docx
The Use of Health Information Technology to Improve Care and .docx
 
J1803026569
J1803026569J1803026569
J1803026569
 
Complex Health Data Visualization
Complex Health Data VisualizationComplex Health Data Visualization
Complex Health Data Visualization
 
Exploring outcomes 12345
Exploring outcomes 12345Exploring outcomes 12345
Exploring outcomes 12345
 
A Large-Scale Informatics Database to Advance Research and Discovery of the E...
A Large-Scale Informatics Database to Advance Research and Discovery of the E...A Large-Scale Informatics Database to Advance Research and Discovery of the E...
A Large-Scale Informatics Database to Advance Research and Discovery of the E...
 
826 Unertl et al., Describing and Modeling WorkflowResearch .docx
826 Unertl et al., Describing and Modeling WorkflowResearch .docx826 Unertl et al., Describing and Modeling WorkflowResearch .docx
826 Unertl et al., Describing and Modeling WorkflowResearch .docx
 
Module 5 (week 9) - InterventionAs you continue to work on your .docx
Module 5 (week 9) - InterventionAs you continue to work on your .docxModule 5 (week 9) - InterventionAs you continue to work on your .docx
Module 5 (week 9) - InterventionAs you continue to work on your .docx
 
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
Automated Generation Of Synoptic Reports From Narrative Pathology Reports In ...
 

Recently uploaded

Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 

Recently uploaded (20)

Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 

Short Story Assignment by Kelly Nguyen

  • 1. Revolutionizing Patient Data Privacy: Federated Learning & Clustering in Healthcare Deep dive into “Privacy-preserving patient clustering for personalized federated learning” written by Ahmed Elhussein and Gamze Gursoy and presented at the Machine Learning for Healthcare 2023 conference Presented by Kelly Nguyen
  • 2. Research Introduction to Healthcare Data Privacy Deep Learning in Healthcare Challenges in Patient Data Clustering Federated Learning as a Privacy Beacon Innovations of PCBFL Methodology Results and Significance Conclusion Personal Insights 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Overview
  • 3. Research Paper: Privacy-preserving patient clustering for personalized federated learning Written by Ahmed Elhussein and Gamze Gursoy Presented at Machine Learning for Healthcare 2023 Focus on non-IID data and privacy in federated learning
  • 4. In the realm of healthcare, the sanctity of patient data is paramount With the advent of digital record-keeping, the potential for sensitive information to be compromised has escalated, necessitating robust solutions to ensure privacy. The critical need for safeguarding healthcare data cannot be overstated not just a matter of legal compliance, but of maintaining the trust at the very core of the patient-provider relationship. Introduction to Healthcare Data Privacy
  • 5. Enhances disease risk prediction, diagnostic support Relies on large EHR datasets Limitations overfitting and poor generalization in small datasets Deep Learning in Healthcare
  • 6. Challenges in Patient Data Clustering 1 Necessity for personalized FL models 2 Risks of traditional clustering methods 3 Privacy conflicts with data sharing
  • 7. A groundbreaking technique that promises to revolutionize the way medical data is utilized for research and analysis Approach allows for the development of powerful, predictive models by learning from decentralized datasets, without ever compromising the privacy of the individual’s data. By processing data locally and sharing only model updates instead of raw information, federated learning provides a blueprint for harnessing the collective power of healthcare data while upholding the inviolable principle of patient privacy. Federated Learning as a Privacy Beacon
  • 8. Innovations of PCBFL Privacy-preserving Community-Based Federated Learning (PCBFL) Utilizes Secure Multiparty Computation (SMPC) to enable the clustering of patient data without exposing individual patient information
  • 10. In order to create a more realistic FL scenario, a subsampling approach was used, where the size of the dataset in each site was limited. The study utilized the eICU collaborative research database, comprising data from over 200,000 patients across the U.S., to predict ICU mortality using variables like diagnosis, medications, and physical exams within the first 48 hours of admission. Focused primarily on patients with complete data to avoid imputation bias, leading to a cohort of 5,000 patients from 20 sites, aligning with a realistic federated learning scenario with limited site data. Cohort and Feature Extraction
  • 11. Creating Patient Embeddings Estimating Patient Similarity Securely Clustering Patients Predicting Mortality PCBFL Protocol A federated autoencoder is trained to obtain latent variables for each feature domain. SMPC uses a secret sharing scheme to jointly calculate the dot product between pairs of vectors Spectral clustering to cluster the patients using similarity matrix generated from pairwise cosine similarities of embeddings Cluster-based FL training. Each model is separately trained per cluster.
  • 12. Feed Forward models were implemented with ReLU activation in the hidden layers and sigmoid activation in the final output layers. Model Training
  • 13. Analyzed patient clusters formed by Privacy- preserving Community-Based Federated Learning (PCBFL) and Community-Based Federated Learning (CBFL), assessing mortality rates and feature distributions Evaluation The data was divided into training and testing sets at a 70:30 ratio. Given the imbalance in the dataset with only 20% positive labels, performance was evaluated using both AUC and AUPRC scores. The models were run 100 times to calculate mean scores and determine 95% confidence intervals with bootstrap estimation.
  • 14. Results Things that PCBFL provided includes privacy-preserving and accurate patient similarity scores using secure multiparty computation, clinically meaningful clusters, and an increase in predictive performance.
  • 15. All in all, the paper presents a personalized federated learning (FL) framework that integrates data mining techniques like clustering to address data heterogeneity and privacy in healthcare. The study overall underscores PCBFL’s promise for collaborative healthcare data analysis which enhances patient privacy and future research aims to assess its generalizability across different healthcare datasets and domains Conclusion
  • 16. I found this research very interesting because not only does it explore traditional deep mining tactics such as clustering, but it goes into advanced modernized solutions such as federated learning. With personal experience working in biotech, the rise of using data mining and deep learning in healthcare is not only revolutionary, but also extraordinary in improving the quality of healthcare itself. I look forward to reading more research and articles on just that. Personal Insights
  • 17. Thank You For further questions, feel free to email me at kelly.nguyen01@sjsu.edu