ARTIFICIAL INTELLIGENCE DIAGNOSIS (AID)
Supervisor: Dr. Muhammad Sarim
Group Members:
Mahira Akhtar
Aly Akbar Ali Hirji
Hammad Ahmed
Muhammad Hassan Siddiqui
18 December 2017
1
OBJECTIVES
• Assist medical professionals in diagnosis
• Predict probable disease and diagnosis
• Provide personalized healthcare to patients
2
MOTIVATION & BACKGROUND
• Too many patients and too few doctors
• Doctors are short on time and may overlook details
• Lab tests can lead to false diagnoses
• Diagnosis can depend on the doctor’s mood
3
MOTIVATION & BACKGROUND
• EMR data is not utilized properly
– Patients’ personal information and medical history are not taken into account
– Patients are often prescribed unnecessary tests
• Demographic characteristics are ignored
– Existing expert systems do not take them into account
– These account for significant differences in baselines
4
METHODOLOGY
• Extract rules from data provided by UMDC
– This process will use data mining methods such as Neural Fuzzy learners
• Extract rules from medical literature
– Online repositories such as PubMed, Medscape, and Wikipedia
– Crawl data from them using web crawlers such as PHPcrawl
• Take baseline differences into account during rule generation
5
METHODOLOGY
• Generated rules will be accessible to doctors
– Through an Excel spreadsheet containing result values of lab tests
– Rules are presented in a table, with each row giving the test result parameter values for one disease
– Doctors can add and edit parameter values and diseases without any programming skills
• The rules will then be converted into XML to update the expert system
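The spreadsheet-to-XML step could look roughly like the sketch below. The slides do not specify the spreadsheet layout or the XML schema, so the CSV export, the column names (`disease`, `test_code`, `low`, `high`), the element names, and the sample values are all illustrative assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV export of the doctors' rules spreadsheet.
# Column names and values are illustrative assumptions, not the project's schema.
RULES_CSV = """disease,test_code,low,high
Example disease A,47,126,400
Example disease A,48,200,600
Example disease B,1,0,11.5
"""

def rules_csv_to_xml(csv_text: str) -> ET.Element:
    """Group rule rows by disease and build a <rules> XML tree."""
    root = ET.Element("rules")
    by_disease = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        node = by_disease.get(row["disease"])
        if node is None:
            # First row for this disease: create its <disease> element.
            node = ET.SubElement(root, "disease", name=row["disease"])
            by_disease[row["disease"]] = node
        ET.SubElement(node, "condition", test_code=row["test_code"],
                      low=row["low"], high=row["high"])
    return root

rules_xml = rules_csv_to_xml(RULES_CSV)
xml_text = ET.tostring(rules_xml, encoding="unicode")
```

Keeping the rules in a flat table means a doctor edits one row per condition, and the conversion groups the rows by disease when it builds the XML.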
6
METHODOLOGY
• Ranked list of possible diseases based on rules and scoring
• Storage and retrieval of patients’ previous diagnoses to improve prediction accuracy
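One possible reading of "rules and scoring" is sketched below. The slides do not describe the scoring scheme, so this assumes each disease is scored by the fraction of its rule conditions that the patient's results satisfy. The rule set, disease names, and values are hypothetical placeholders.

```python
# Hypothetical rules: disease -> list of (test_code, low, high) conditions.
# The scoring scheme (fraction of matched conditions) is an assumption.
RULES = {
    "Example disease A": [(47, 126.0, 400.0), (48, 200.0, 600.0)],
    "Example disease B": [(1, 0.0, 11.5)],
}

def rank_diseases(results: dict, rules: dict) -> list:
    """Return (disease, score) pairs sorted by descending score."""
    ranked = []
    for disease, conditions in rules.items():
        matched = sum(
            1 for code, low, high in conditions
            if code in results and low <= results[code] <= high
        )
        ranked.append((disease, matched / len(conditions)))
    # Sort by score (highest first), then by name for a stable order.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

patient = {47: 140.0, 48: 150.0, 1: 13.0}
ranking = rank_diseases(patient, RULES)
```

Here the patient satisfies one of the two conditions for the first placeholder disease and none for the second, so the first is ranked higher.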
7
SYSTEM DESIGN OVERVIEW
8
EXTENSIONS
• Use of Symptoms during the prediction
• Medical Analysis based on demographic characteristics such
as gender, residential address etc.
• Integration of expert system with an existing Hospital EMR
• Risk monitoring system to identify patients at risk
9
DATA UNDERSTANDING
▪ The blood test data provided by UMDC contains about 200,000 records
▪ Multiple tests for about 54,000 patients
▪ Of these, a diagnosis is recorded for only 3,000
▪ Patient Tests:
Test Code   Test name         Normal value range
1           Haemoglobin       11.5 – 18 (g/dl)
17          Urea              10 – 50 (mg%)
18          Creatinine        0.5 – 1.5 (mg%)
25          Potassium         3.8 – 5.2 (mEq/L)
47          Glucose Fasting   70 – 110 (mg%)
48          Glucose Random    80 – 180 (mg%)
10
DATA UNDERSTANDING
•Actual Data – Test Results Table
11
DATA UNDERSTANDING
•Actual Data – Vitals Table
12
DATA UNDERSTANDING
•Actual Data – Diagnosis Table
13
DATA UNDERSTANDING
•Correlation Matrix
14
DATA UNDERSTANDING
• Problems with the data
― Multiple diagnoses recorded for a patient at the same date and time
― Test codes inconsistent with test names, e.g. Haemoglobin records are classified under test code 1 and most Glucose (fasting) records under test code 47, but a few Glucose (fasting) records are misclassified under test code 1
― Some test names are inconsistent, e.g. the Haemoglobin test name is recorded as “Haemoglobin”, “Hb”, and “Haemoglobin %”
― Human errors in data entry, e.g. temperature recorded as 980 °F (probably meant to be 98.0 °F)
15
DATA UNDERSTANDING
•Problems with the data
16
DATA UNDERSTANDING
•Problems with the data
– Multiple test result values are recorded against the same registration number and the same date and time.
17
DATA UNDERSTANDING
– Test value inconsistency: more than 800 cells contain free text such as ‘127 (AFTER GLOCOUSE 01 HR)’ and ‘AFTER 75GRM GLOCOUSE 01HR (92)’
18
DATA UNDERSTANDING
– Test code and test name inconsistencies were resolved with Excel formulas such as:
=IF(OR(P2="Haemoglobin %",P2="Hb"),"Haemoglobin",P2)
– And, to extract the text inside parentheses from a cell when N2 is "true":
=IF(N2="true",(MID(L2,SEARCH("(",L2)+1,SEARCH(")",L2,SEARCH("(",L2)+1)-SEARCH("(",L2)-1)),N2)
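For comparison, the same two clean-up steps can be sketched in Python. This is not the project's code. The alias map covers only the spellings mentioned in the slides, and `extract_numeric_value` is a hypothetical helper whose behaviour differs from the second formula: it looks for a number in parentheses first and otherwise falls back to the first number in the cell, a preference order that is an assumption.

```python
import re

# Variant spellings mentioned in the slides, mapped to one canonical name.
NAME_ALIASES = {"Haemoglobin %": "Haemoglobin", "Hb": "Haemoglobin"}

def normalize_test_name(name: str) -> str:
    """Mirror of the first formula: map known aliases, leave others unchanged."""
    return NAME_ALIASES.get(name, name)

def extract_numeric_value(cell: str):
    """Pull a number out of a free-text result cell.

    Prefers a number enclosed in parentheses, otherwise falls back to the
    first number in the cell. Returns None if the cell contains no number.
    """
    inside = re.search(r"\(\s*(\d+(?:\.\d+)?)\s*\)", cell)
    if inside:
        return float(inside.group(1))
    first = re.search(r"\d+(?:\.\d+)?", cell)
    return float(first.group(0)) if first else None
```

Applied to the two example cells quoted earlier, this yields 127.0 for ‘127 (AFTER GLOCOUSE 01 HR)’ and 92.0 for ‘AFTER 75GRM GLOCOUSE 01HR (92)’.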
19
DATA UNDERSTANDING
•Pivot Table: Test Codes as Columns, grouped by date and regno
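The pivot can be illustrated with a small stdlib-only sketch. The project built it as a spreadsheet pivot table, so this is only an equivalent reshaping. The row layout `(regno, date, test_code, value)` and the sample values are made up for illustration.

```python
from collections import defaultdict

# Long-format rows: (regno, date, test_code, value). Values are made up.
rows = [
    ("R001", "2017-01-05", 1, 13.2),
    ("R001", "2017-01-05", 47, 95.0),
    ("R002", "2017-01-06", 1, 10.8),
]

def pivot_by_visit(rows):
    """Return {(regno, date): {test_code: value}}, one entry per visit."""
    table = defaultdict(dict)
    for regno, date, code, value in rows:
        # If a test repeats within a visit, the last value wins (an assumption).
        table[(regno, date)][code] = value
    return dict(table)

wide = pivot_by_visit(rows)
codes = sorted({code for tests in wide.values() for code in tests})
# One wide row per visit; tests that were not taken appear as None.
matrix = {visit: [tests.get(c) for c in codes] for visit, tests in wide.items()}
```

Each wide row then has one column per test code, which is the shape a classifier needs and which makes the missing values explicit.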
20
DATA CLEANING
•Handling Missing Values
21
MODELLING – Naïve Bayes
22
DATA CLEANING
•Binning: an out-of-range column was added for test values, based on the normal ranges
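A minimal sketch of this binning step, using the normal ranges from the Patient Tests table. The three-way label (`low`, `normal`, `high`) is an assumption; the slides only say that an out-of-range column was added.

```python
# Normal ranges from the Patient Tests table, keyed by test code.
NORMAL_RANGES = {
    1: (11.5, 18.0),    # Haemoglobin
    17: (10.0, 50.0),   # Urea
    18: (0.5, 1.5),     # Creatinine
    25: (3.8, 5.2),     # Potassium
    47: (70.0, 110.0),  # Glucose Fasting
    48: (80.0, 180.0),  # Glucose Random
}

def bin_value(test_code: int, value: float) -> str:
    """Label a result as 'low', 'normal', or 'high' against its normal range."""
    low, high = NORMAL_RANGES[test_code]
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"
```

For example, `bin_value(47, 95.0)` falls inside the 70 to 110 fasting glucose range and is labelled `normal`.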
23
DATA CLEANING
•Handling missing values: since a patient whose test reports are cleared will have values in the normal range, we filled missing values with the average of the normal range for that test
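The imputation rule can be sketched as follows, reading "average of the normal range" as the midpoint of the lower and upper bounds. That reading is an assumption, as is the dictionary layout; only two of the listed tests are included to keep the example short.

```python
# Normal ranges for two of the tests listed earlier, keyed by test code.
NORMAL_RANGES = {1: (11.5, 18.0), 47: (70.0, 110.0)}

def impute_missing(visit: dict) -> dict:
    """Fill absent or None results with the midpoint of the normal range."""
    filled = dict(visit)  # copy so the caller's data is not modified
    for code, (low, high) in NORMAL_RANGES.items():
        if filled.get(code) is None:
            filled[code] = (low + high) / 2
    return filled

completed = impute_missing({1: 10.8, 47: None})
```

The recorded Haemoglobin value is kept, while the missing fasting glucose result is filled with 90.0, the midpoint of 70 to 110.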
24
KNIME Workflows
25
KNIME Workflows
26
MODEL AND PREDICTION
•Multinomial Naïve Bayes
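To make the model concrete, here is a small pure-Python sketch of multinomial Naïve Bayes with Laplace smoothing. It is not the project's KNIME workflow. The representation of each visit as a bag of `test_code:bin` tokens, the toy training rows, and the placeholder labels are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples, alpha=1.0):
    """Fit multinomial Naive Bayes on (tokens, label) pairs; alpha is the Laplace smoothing constant."""
    class_counts = Counter(label for _, label in samples)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab, alpha

def predict_nb(model, tokens):
    """Return labels ranked by log-posterior (highest first)."""
    class_counts, token_counts, vocab, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        # Smoothed denominator: token total for the class plus alpha per vocabulary entry.
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        score = math.log(n / total)
        for tok in tokens:
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][tok] + alpha) / denom)
        scores[label] = score
    return sorted(scores, key=scores.get, reverse=True)

# Toy training data: binned test results as tokens; labels are placeholders.
train = [
    (["47:high", "48:high"], "Example disease A"),
    (["47:high", "1:normal"], "Example disease A"),
    (["1:low", "47:normal"], "Example disease B"),
]
model = train_nb(train)
ranking = predict_nb(model, ["47:high", "1:normal"])
```

Because the function returns every label in score order, its output can be used directly as the ranked list of possible diseases.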
27
CONCLUSION
• Build a medical expert system to assist medical professionals, especially doctors, in diagnosis
• Make medical literature a direct support for diagnosis
• Enable personalised treatment for patients based on their medical history
• Serve the medical community as computer scientists, given the field’s interdisciplinary nature
28
QUESTIONS?
29
