Electrical Engineering, 4th year: 王鴻鈞
Cloud
Data Mining in Health Care System
2014-12-30
Summary
- Typical content of a generic electronic health record (EHR) system
- How data mining on health data fills knowledge gaps and assists informed clinical decision making
- How integrating EHR and genetic data with systems biology approaches facilitates genotype–phenotype association studies
Classification of Electronic Health Record (EHR) Data
• Administrative data
- Data that serve administrative purposes
• Ancillary clinical data
- Data provided by laboratories, pharmacies, and radiological and medical imaging services
(Genotype and sequence data are another ancillary source of potentially structured data.)
• Clinical text
- Written or dictated clinical narratives
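Why Integrate Health Care Data
• Correlating Clinical Features
- Different diseases can share the same symptoms or tend to occur together (co-occurrence).
- Linking common observations to important diseases can reveal new signs of disease.
• Prediction from Data
- In some cases, previously discovered correlations or other known facts can be used to build a clinical decision model that gives doctors a reference for predicting a patient’s condition.
• Patient Stratification
- Patients are grouped into clusters; patients in the same group usually show similar symptoms.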
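The co-occurrence idea on this slide can be illustrated with a pairwise count. The following is a minimal sketch, assuming each patient is represented as a set of diagnosis labels; the function name `comorbidity_lift` and the toy data are hypothetical, and lift (observed co-occurrence divided by the co-occurrence expected under independence) is only one possible measure, not necessarily the one used in the surveyed papers.

```python
from collections import Counter
from itertools import combinations


def comorbidity_lift(patients):
    """Return {(a, b): lift} for every pair of diagnoses seen together."""
    n = len(patients)
    single = Counter()  # how many patients have each diagnosis
    pair = Counter()    # how many patients have both diagnoses of a pair
    for diagnoses in patients:
        codes = sorted(set(diagnoses))
        single.update(codes)
        pair.update(combinations(codes, 2))
    # lift > 1 means the pair co-occurs more often than independence predicts
    return {
        (a, b): (count / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), count in pair.items()
    }


# Toy data (hypothetical): one set of diagnoses per patient.
patients = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension"},
    {"diabetes"},
    {"asthma"},
]
lift = comorbidity_lift(patients)
```

A lift above 1 for a pair suggests the two conditions appear together more often than chance alone would explain, which is a starting point for a closer look rather than evidence of a causal link.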
Electronic health record content
The electronic health record (EHR) of a patient can be viewed as a repository of information
regarding his or her health status in a computer-readable form. An encounter with the
health-care system generates various types of patient-linked data.
Four ways to analyze EHR data
1) Comorbidity 2) Machine Learning 3) Patient Clustering 4) Cohort Querying
Comorbidity
Patient clustering
Classification by Machine Learning
Cohort querying
Dealing with Clinical Text
Using natural language processing (NLP):
1. Sentence boundary detection splits the text into individual sentences.
2. Tokenization splits each sentence into individual tokens (typically words), using spaces and punctuation as a guide, with rules for handling special cases such as dates.
3. Normalization reduces tokens to a base form.
4. Part-of-speech tagging assigns each token its grammatical category in context.
5. Shallow parsing identifies syntactic units, most importantly noun phrases (NPs), which are grammatical units built from a noun with optional modifiers such as adjectives.
6. NPs and various lexical permutations are then mapped to controlled vocabularies.
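The steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not a clinical NLP engine: it covers steps 1, 2, 3 and 6 only (part-of-speech tagging and noun-phrase detection are replaced by a plain two-word lookup), and the vocabulary, concept IDs, lemma table and function names are all hypothetical.

```python
import re

# Hypothetical controlled vocabulary: normalized phrase -> made-up concept ID.
VOCABULARY = {
    "diabetes mellitus": "CONCEPT_1",
    "chest pain": "CONCEPT_2",
}
# Hypothetical lemma table standing in for a real normalizer.
LEMMAS = {"pains": "pain"}


def split_sentences(text):
    # Step 1: naive boundary detection on ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Step 2: keep ISO-style dates whole, otherwise split into word tokens.
    return re.findall(r"\d{4}-\d{2}-\d{2}|\w+", sentence)


def normalize(token):
    # Step 3: lowercase, then map to a base form if one is known.
    token = token.lower()
    return LEMMAS.get(token, token)


def map_concepts(tokens):
    # Step 6 (simplified): look up every two-token phrase in the vocabulary.
    found = []
    for first, second in zip(tokens, tokens[1:]):
        phrase = f"{first} {second}"
        if phrase in VOCABULARY:
            found.append((phrase, VOCABULARY[phrase]))
    return found


def extract_concepts(text):
    results = []
    for sentence in split_sentences(text):
        tokens = [normalize(t) for t in tokenize(sentence)]
        results.extend(map_concepts(tokens))
    return results


note = "Seen on 2014-12-30 for chest pains. History of Diabetes Mellitus."
concepts = extract_concepts(note)
```

A real pipeline would replace the two-word lookup with tagging and chunking so that noun phrases of any length can be matched.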
How the System Is Actually Implemented
Take the health care system “GEMINI” as an example.
The GEMINI system consists of two components:
1. The PROFILING component extracts each patient’s data from various sources and stores it as information in a patient profile graph.
2. The ANALYTICS component analyzes the patient profile graphs to infer implicit information and extract relevant features for the prediction tasks.
(Whole view of GEMINI)
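To make the two components more concrete, here is a minimal sketch of a patient profile graph and a feature-extraction pass over it. It assumes the graph can be modeled as a set of (source, relation, target) edges; the class `ProfileGraph`, the relation names and the helper functions are hypothetical and are not taken from the GEMINI implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileGraph:
    patient_id: str
    # Each edge is a (source node, relation, target node) triple.
    edges: set = field(default_factory=set)

    def add(self, source, relation, target):
        self.edges.add((source, relation, target))

    def neighbors(self, source, relation):
        return sorted(t for s, r, t in self.edges if s == source and r == relation)


def build_profile(patient_id, lab_results, note_concepts):
    # PROFILING (sketch): merge structured lab results and concepts
    # extracted from free-text notes into one graph per patient.
    graph = ProfileGraph(patient_id)
    for test, value in lab_results.items():
        graph.add(patient_id, "has_lab", f"{test}={value}")
    for concept in note_concepts:
        graph.add(patient_id, "mentions", concept)
    return graph


def extract_features(graph):
    # ANALYTICS (sketch): flatten the graph into binary features.
    return {f"{relation}:{target}": 1 for _, relation, target in graph.edges}


graph = build_profile("p1", {"HbA1c": 8.1}, ["diabetes mellitus"])
features = extract_features(graph)
```

Keeping the graph as plain triples makes it easy to add new sources later without changing the feature-extraction step.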
Input of GEMINI
- Clinical Data
The repository has multiple sources of patient data: 1) structured sources containing patients’ demographics, lab test results, medication history, etc.; 2) unstructured sources storing doctors’ free-text notes.
- Medical Knowledge Base
GEMINI uses UMLS, a well-known medical knowledge base, to interpret unstructured doctors’ notes, i.e., to identify medical concepts (e.g., diabetes mellitus) and relationships between concepts (e.g., HbA1c measures control of diabetes mellitus).
How to do “Patient Profiling”
- This component uses NLP engines to extract named entities, called mentions. It then applies collective inference to simultaneously map mentions to their semantically matching concepts in the knowledge base and to discover additional relationships.
- To improve the accuracy of this process, the component asks doctors to verify or corroborate the identified mention-concept mappings and concept relationships.
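The verify-with-doctors loop can be illustrated with a much simpler matcher than collective inference. The sketch below swaps in plain string similarity (`difflib.SequenceMatcher` from the Python standard library) and routes low-confidence mappings to a review list; the concept list, the 0.9 threshold and the function names are assumptions made for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge-base concepts.
CONCEPTS = ["diabetes mellitus", "hypertension", "chest pain"]


def best_concept(mention):
    # Score the mention against every concept and keep the closest one.
    scored = [
        (SequenceMatcher(None, mention.lower(), concept).ratio(), concept)
        for concept in CONCEPTS
    ]
    score, concept = max(scored)
    return concept, score


def map_mentions(mentions, threshold=0.9):
    accepted = {}      # confident mappings, stored directly
    needs_review = []  # uncertain mappings, to be verified by a doctor
    for mention in mentions:
        concept, score = best_concept(mention)
        if score >= threshold:
            accepted[mention] = concept
        else:
            needs_review.append((mention, concept))
    return accepted, needs_review


accepted, needs_review = map_mentions(["Hypertension", "diabetes"])
```

Only the uncertain pairs reach the doctor, which keeps the amount of manual verification small.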
How to do “Healthcare Analytics”
The ANALYTICS component of GEMINI consists of three major steps:
1) Feature Selection
- Everything contained in the patient profile graphs can be used as a feature for the analytics tasks. ANALYTICS can also derive implicit but important features with expert input from healthcare professionals.
2) Training Data Labelling
- Use doctors’ input to label a small number of patients with the most informative data (i.e., patient profile graphs) to derive a training set
- The goal is a diverse set of labeled patients that covers as much of the data space as possible
- Avoid overwhelming the doctors with too much information
3) Analytics Algorithms
- Conventional analytics algorithms, such as classification, clustering and prediction, perform the various analytics tasks
- Expert rules/heuristics may also be applied to the analytics tasks (e.g., majority voting)
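Two of the ideas above, choosing a small but diverse set of patients to label and combining predictions by majority voting, can be sketched as follows. This is an illustrative sketch only: greedy farthest-point selection is one simple way to spread the labeled set over the data space and is an assumption here, not the selection method described for GEMINI; the feature vectors and function names are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter


def pick_diverse(points, k):
    """Greedy farthest-point selection of up to k patient indices to label."""
    chosen = [0]  # start from the first patient (arbitrary choice)
    while len(chosen) < min(k, len(points)):
        # Pick the patient whose nearest already-chosen patient is farthest away.
        best = max(
            (i for i in range(len(points)) if i not in chosen),
            key=lambda i: min(math.dist(points[i], points[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen


def majority_vote(predictions):
    # Return the label predicted most often.
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors, one per patient.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 4.0)]
to_label = pick_diverse(points, 3)
vote = majority_vote(["high risk", "low risk", "high risk"])
```

Here the near-duplicate patient at index 1 is skipped, so the doctors label three patients that are far apart instead of two that look alike.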
How to implement the “Supporting Platform”
using “epiC”
GEMINI uses a flexible parallel processing framework (epiC) to support:
1) Distributed data storage that effectively partitions clinical data and stores it across multiple nodes.
2) Scalable NLP processing and data analytics that involve various computation models, such as the MapReduce model for entity extraction, the Pregel model for graphical inference, deep learning for analytics, etc.
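The MapReduce style of entity extraction mentioned above can be mimicked in a single process. The sketch below is plain Python and does not use the epiC API; the entity list, the partitioning of notes and the function names are hypothetical, and matching is reduced to a whitespace split for brevity.

```python
from collections import defaultdict

# Hypothetical set of entities to look for.
ENTITIES = {"diabetes", "hypertension"}


def map_phase(note):
    # Map: emit (entity, 1) for each known entity found in one note.
    return [(word, 1) for word in note.lower().split() if word in ENTITIES]


def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Reduce: sum the counts for each entity.
    return {key: sum(values) for key, values in groups.items()}


# Each inner list stands for the notes stored on one node.
partitions = [
    ["diabetes follow-up", "hypertension and diabetes"],
    ["no complaints", "hypertension stable"],
]
pairs = [pair for part in partitions for note in part for pair in map_phase(note)]
counts = reduce_phase(shuffle(pairs))
```

In a distributed setting the map phase would run on each node next to its own partition, and only the small (entity, count) pairs would travel over the network.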
Linking to the molecular level
- Integrating genetics
- Systems biology and gene-network-based decision support
Take a Genome Analysis Startup as an Example
Connecting with Databases
Using “ANNOVAR” as the Annotation Engine
Workflows in electronic health record-driven genomic research
Limiting factors — key problems to overcome
- Privacy, autonomy and consent
- Interoperability across institutions, countries and continents
References
• GEMINI: An Integrative Healthcare Analytics System
• epiC: an Extensible and Scalable System for Processing Big Data
• Semantics Driven Approach for Knowledge Acquisition From EMRs
• Mining electronic health records: towards better research applications and clinical care
• Using electronic health records to drive discovery in disease genomics
• Contextual Crowd Intelligence
• Opportunities for genomic clinical decision support interventions
• The role of primary care in early detection and follow-up of cancer

Data mining paper survey for Health Care Support System
