The document discusses the objectives, motivation, methodology, and design of an Artificial Intelligence Diagnosis (AID) system. The objectives are to: [1] assist medical professionals in diagnosis; [2] predict probable diseases and diagnoses; and [3] provide personalized healthcare to patients. The motivation is addressing issues like limited doctors, overlooked details, and reliance on subjective diagnosis. The methodology extracts rules from medical data and literature to generate diagnoses, ranks possible diseases, and stores past diagnoses to improve accuracy over time. The system design integrates these components to build an expert system that aims to better serve patients and the medical community.
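To give a flavor of the rule-extraction-and-ranking idea, here is a minimal sketch; the rule base, symptom weights, and example symptoms are invented placeholders, not the AID system's actual knowledge base.

```python
# Hypothetical rule base: disease -> weighted symptom evidence (illustrative values).
RULES = {
    "influenza": {"fever": 0.8, "cough": 0.6, "fatigue": 0.5},
    "common cold": {"cough": 0.7, "sneezing": 0.8, "fatigue": 0.3},
    "strep throat": {"fever": 0.6, "sore throat": 0.9},
}

def rank_diagnoses(symptoms):
    """Score each disease by the summed weight of matching symptoms, then rank."""
    scores = {
        disease: round(sum(w for s, w in evidence.items() if s in symptoms), 2)
        for disease, evidence in RULES.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank_diagnoses({"fever", "cough"}))
# -> [('influenza', 1.4), ('common cold', 0.7), ('strep throat', 0.6)]
```

A real system of this kind would also feed confirmed past diagnoses back into the weights, which is how stored history can improve ranking accuracy over time.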
Simvastatin in Aneurysmal Subarachnoid Hemorrhage (SASH) Trial is a prospective, randomized, double-blind, placebo-controlled pilot trial that assessed the role of simvastatin in preventing vasospasm and improving outcomes in patients with aneurysmal subarachnoid hemorrhage. The trial randomized 38 patients to receive either simvastatin 80 mg or placebo for 14 days after aneurysm clipping. Results showed lower rates of vasospasm, neurological deterioration, and mortality in the simvastatin group, though the differences were not statistically significant. Larger multicenter trials are still needed to definitively determine whether statins provide clinical benefits in aneurysmal subarachnoid hemorrhage.
SIMVASTATIN IN ANEURYSMAL SUBARACHNOID HAEMORRHAGE (SASH) TRIAL (Sumit2018)
This document describes the SASH trial, a prospective randomized double-blind placebo-controlled pilot study that assessed the role of simvastatin in preventing vasospasm and improving outcomes in patients with aneurysmal subarachnoid hemorrhage (SAH). The study found lower rates of vasospasm, neurological deterioration, and mortality in the simvastatin group compared to placebo, though the differences were not statistically significant due to the small sample size. The document concludes that while statins may provide benefits, larger phase III studies are still needed to definitively determine if statins improve outcomes for SAH patients.
This document describes using machine learning techniques to detect heart disease. It discusses applying data analytics methods like SVM and genetic algorithms to large datasets to better predict, prevent, and manage cardiovascular diseases. The proposed system aims to use these machine learning methods to predict heart attacks and reduce treatment costs by providing effective treatments. It evaluates models based on accuracy, elapsed time, and energy consumption to determine the best optimized prediction model.
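As a rough illustration of the kind of SVM-based prediction the summary describes, here is a minimal scikit-learn sketch; the synthetic dataset, feature count, and timing harness are stand-in assumptions, not the proposed system's actual design.

```python
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for a heart-disease dataset: binary labels, tabular features.
X, y = make_classification(n_samples=500, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling matters for SVMs, so wrap scaler + classifier in one pipeline.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

start = time.perf_counter()
model.fit(X_train, y_train)
elapsed = time.perf_counter() - start

# Evaluate on two of the axes the document mentions: accuracy and elapsed time.
print(f"accuracy={model.score(X_test, y_test):.3f}  train_time={elapsed:.3f}s")
```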
The document discusses evolving consensus-based curation strategies for the Guide to PHARMACOLOGY database. It summarizes how the database overlays data from multiple sources to define consensus lists of approved drugs and their targets. Through comparing various sources, the database curators established consensus sets of 202 drug targets and 923 approved drugs. The curators aim to balance comprehensive coverage with pragmatic utility by focusing on data-supported relationships between drugs, targets, and activities.
Prof. Todor (Ted) A. Popov - 6th Clinical Research Conference (Starttech Ventures)
Talk / presentation: Prof. Todor (Ted) A. Popov, Professor of Medicine, Medical University in Sofia, Chairman of the Bulgarian Ethics Committee for Multicenter Studies
Presentation title: "Do databases around the world speak the same language?"
AI for Precision Medicine (Pragmatic preclinical data science) (Paul Agapow)
This document summarizes a presentation on using data science approaches like machine learning for precision medicine and biomedical research. It notes that biomedical data sets are often small, which limits the use of deep learning techniques that require large amounts of labeled data. It advocates combining multiple smaller datasets together using standards to create larger datasets for analysis. It also emphasizes using multiple data types (e.g. omics data, electronic health records, social media) together through integrated analysis to provide more context than any single data type alone. It provides examples of applying these approaches to problems like classifying texts for systematic reviews and discovering asthma subtypes through multi-omics analysis.
Combining Patient Records, Genomic Data and Environmental Data to Enable Tran... (Perficient, Inc.)
The average academic research organization (ARO) and hospital has many systems that house patient-related information, such as patient records and genomic data. Combining data from a variety of sources in an ongoing manner can enable complex and meaningful querying, reporting and analysis for the purposes of improving patient safety and care, boosting operational efficiency, and supporting personalized medicine initiatives.
In this webinar, Perficient’s Mike Grossman, a director of clinical data warehousing and analytics, and Martin Sizemore, a healthcare strategist, discussed:
- How AROs and hospitals can benefit from a systematic approach to combining data from diverse systems and utilizing a suite of data extraction, reporting, and analytical tools to support a wide variety of needs and requests
- Examples of proposed solutions to real-life challenges AROs and hospitals often encounter
This document discusses using machine learning algorithms to detect heart disease. It proposes using support vector machines (SVM) and genetic algorithms to analyze large datasets to better predict, prevent, and manage cardiovascular diseases. The existing systems have high treatment costs and low accuracy. The proposed system aims to more accurately predict heart attacks using data preprocessing and SVM/genetic algorithm models to reduce costs and increase effectiveness. It details the system requirements, modules, dataflow, and provides screenshots of a sample interface.
Hani Tamim is a professor of epidemiology and biostatistics at Al-Faisal University in Riyadh, Saudi Arabia. Biostatistics involves applying statistical methods to answer biological and health-related research questions, such as determining the survival rate among ICU patients or the incidence of Down's syndrome in a population. The research process involves formulating a research question, reviewing literature, designing a study, collecting and analyzing data, and reporting results. Statistics is used to organize, summarize, and make inferences about data to address research questions.
NLP (Natural Language Processing) shows a great deal of potential for many applications in the healthcare industry. This document shares 6 promising use cases for NLP to manage Epilepsy treatment effectively.
This document discusses several common problems with data handling and quality, including building and testing models on the same data, confusion between biological and technical replicates, and the identification and handling of outliers. It provides examples and explanations of key concepts such as experimental and sampling units, pseudo-replication, outliers versus high-influence points, and leverage plots. Proper data-handling techniques, such as dividing data into training, test, and confirmation sets and using cross-validation, are emphasized to avoid overfitting models and generating spurious findings. A minimal illustration of that discipline follows.
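Here is a small sketch of the train/test discipline described above; the split sizes, synthetic data, and choice of Ridge regression are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 0] * 2.0 + rng.normal(size=200)  # only feature 0 carries real signal

# Never evaluate on the data used to fit: hold out a test set...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = Ridge().fit(X_train, y_train)
print("held-out R^2:", round(model.score(X_test, y_test), 3))

# ...and use cross-validation on the training portion for model selection.
print("CV R^2 per fold:", cross_val_score(Ridge(), X_train, y_train, cv=5).round(3))
```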
Introduction to MedDRA Coding in Drug Safety & Pharmacovigilance Process for Pharmaceuticals, Bio-Pharmaceuticals, Medical Devices, Cosmeceuticals and Foods.
Contact:
"Katalyst Healthcares & Life Sciences"
South Plainfield, NJ, USA
info@KatalystHLS.com
Big Data Analytics for Healthcare - Chandan K. Reddy.docx (aulasnilda)
Big Data Analytics for Healthcare
Chandan K. Reddy, Department of Computer Science, Wayne State University
Jimeng Sun, Healthcare Analytics Department, IBM TJ Watson Research Center
Healthcare Analytics using Electronic Health Records (EHR)
Old way: data are expensive and small
- Input data come from clinical trials, which are small and costly
- Modeling effort is small since the data are limited; even a single model can take months
EHR era: data are cheap and large
- Broader patient population
- Noisy data
- Heterogeneous data
- Diverse scale
- Complex use cases
Heterogeneous Medical Data
- Diagnosis
- Medication
- Lab
- Clinical notes
- Images
- Genetic data
Challenges in Healthcare Analytics
- Collaboration across domains
- Analytic platform
- Intuitive results
- Scalable computation
PARALLEL MODEL BUILDING
Motivation: predictive modeling using EHR is growing. Scalable predictive modeling platforms/systems are needed due to increased computational requirements from:
- Processing EHR data (volume, variability, and heterogeneity)
- Building accurate models
- Building clinically meaningful models
- Validating models for accuracy and generalizability
[Chart: explosion of interest in EHR-based predictive modeling]
What does it take to develop a predictive model using EHR?
Marina, IBM analytics consultant: "Within 3 months, we need to
1. understand the business case
2. obtain the data
3. prepare the data
4. develop predictive models
5. deliver the final model"
David Gotz, Harry Stavropoulos, Jimeng Sun, Fei Wang. ICDA: A Platform for Intelligent Care Delivery Analytics. AMIA 2012.
A Generalized Predictive Modeling Pipeline (the model specification)
- Cohort Construction: find an appropriate set of patients with the specified target condition and a corresponding set of control patients without the condition.
- Feature Construction: compute a feature-vector representation for each patient from the patient's EHR data.
- Cross-Validation: partition the data into complementary subsets for use in model training and validation testing.
- Feature Selection: rank the input features and select a subset of relevant features for use in the model.
- Classification: train and evaluate a model for a specific classifier.
- Output: clean up intermediate files and put results in their final locations.
A minimal sketch of these stages follows.
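To make the stages concrete, here is a minimal, hypothetical scikit-learn sketch of such a pipeline; the synthetic data, feature counts, and parameter choices are illustrative assumptions, not details from the slides.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Cohort construction stand-in: synthetic cases (y=1) and controls (y=0),
# already reduced to a per-patient feature vector (feature construction).
X, y = make_classification(n_samples=1000, n_features=50,
                           n_informative=10, random_state=0)

# Feature selection + classification wrapped together, so cross-validation
# re-fits the entire model specification on each training fold.
model = Pipeline([
    ("select", SelectKBest(f_classif, k=20)),    # rank features, keep top 20
    ("clf", LogisticRegression(max_iter=1000)),  # train the classifier
])

# Cross-validation: complementary train/validation partitions.
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("AUC per fold:", scores.round(3))
```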
Cohort Construction
[Diagram: case and control cohorts drawn from the pool of all patients]
Disease | Target | Samples
D1 | Hypertension control | 5,000
D2 | Heart failure onset | 33K
D3 | Hypertension diagnosis | 300K
Clinical Validation of Copy Number Variants Using the AMP Guidelines (Golden Helix)
This document discusses Golden Helix's software for clinical variant analysis and summarizing copy number variants using American College of Medical Genetics (ACMG) and Association for Molecular Pathology (AMP) guidelines. It acknowledges funding support from several National Institutes of Health grants. It also lists upcoming discussion sessions on applying ACMG/AMP guidelines in clinical practice and analyzing copy number variants from next-generation sequencing data.
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments (Luis Marco Ruiz)
Databases for Clinical Information Systems are difficult to design and implement, especially when the design must comply with a formal specification or standard. The openEHR specifications offer a very expressive and generic model for clinical data structures, allowing semantic interoperability and compatibility with other standards like HL7 CDA, FHIR, and ASTM CCR. But openEHR is not only for data modeling: it specifies an EHR computational platform designed to create highly modifiable, future-proof EHR systems and to support long-term, economically viable projects, with a knowledge-oriented approach that is independent of specific technologies. Software developers face great complexity in designing openEHR-compliant databases, since the specifications include no guidelines in that area. The authors of this tutorial are developers who had to overcome these challenges. The tutorial presents the requirements, design principles, technologies, techniques, and main challenges of implementing an openEHR-based clinical database, with examples and lessons learned to help designers and developers overcome the challenges more easily.
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments (Luis Marco Ruiz)
Modern medicine needs methods that make data captured during health care accessible for research, surveillance, decision support, and other reuse purposes. Initiatives like the National Patient-Centered Clinical Research Network in the US and Electronic Health Records for Clinical Research in the EU are facilitating the reuse of Electronic Health Record (EHR) data for clinical research. One barrier to data reuse is the integration and interoperability of different Healthcare Information Systems (HIS), owing to differences among their information and terminology models. EHR standards like openEHR can lower these barriers by providing a standard, unambiguous, semantically enriched representation of clinical data that enables semantic interoperability and data integration. Few works have been published describing how to move proprietary data stored in EHRs into standard openEHR repositories. This tutorial provides an overview of the key concepts, tools, and techniques needed to implement an openEHR-based Data Warehouse (DW) environment for reusing clinical data. We aim to provide insights into extracting data from proprietary sources, transforming it into openEHR-compliant instances to populate a standard repository, and enabling access to it using standard query languages and services. A simplified sketch of the transform step follows.
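To illustrate the shape of that transform step, here is a deliberately simplified Python sketch; the source row, field names, and the flattened composition structure are hypothetical and do not follow the actual openEHR reference model or any real archetype IDs.

```python
# Hypothetical proprietary source row (e.g., from a legacy HIS table).
source_row = {"pid": "12345", "sys_bp": 142, "dia_bp": 91,
              "ts": "2017-03-02T10:15:00"}

def to_openehr_like_composition(row: dict) -> dict:
    """Map a proprietary record to a simplified, openEHR-flavoured structure.

    Real implementations target archetypes/templates (e.g., a blood pressure
    OBSERVATION) and validate instances against them; this sketch only shows
    the shape of the mapping problem, not the real reference model.
    """
    return {
        "ehr_id": row["pid"],
        "composition": {
            "archetype_hint": "blood_pressure_observation",  # placeholder, not a real archetype ID
            "time": row["ts"],
            "items": [
                {"name": "systolic", "magnitude": row["sys_bp"], "units": "mm[Hg]"},
                {"name": "diastolic", "magnitude": row["dia_bp"], "units": "mm[Hg]"},
            ],
        },
    }

print(to_openehr_like_composition(source_row))
```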
BioGears Overview for SSIH Healthcare Systems Modeling & Simulation Affinity ... (BioGearsEngine)
Our Principal Investigator, Austin Baird, provides an overview of the BioGears Human Physiology Engine and explains several use cases for physiology modeling and simulation.
This document provides a summary of a project to analyze factors related to readmission of diabetes patients using a dataset from 130 US hospitals. The team cleaned the data by removing attributes with high percentages of missing values, irrelevant attributes, and instances of deceased patients. They applied the SMOTE technique to address data imbalance, oversampling the minority readmission class by 200%. Three classifiers - J48 decision tree, Naive Bayes, and Bayes Net - were selected for experiments to predict patient readmission.
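A minimal sketch of the SMOTE step using the imbalanced-learn package follows; the toy data are an assumption, and the "200% oversampling" in the summary is a tool-specific parameter, expressed here instead as a minority-to-majority sampling ratio.

```python
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Toy imbalanced data standing in for the readmission dataset.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
print("before:", Counter(y))

# SMOTE synthesizes minority-class examples by interpolating between neighbors.
smote = SMOTE(sampling_strategy=0.5, random_state=0)  # minority -> 50% of majority
X_res, y_res = smote.fit_resample(X, y)
print("after: ", Counter(y_res))
```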
Disease phenotypes are descriptions of clinically observable or measurable traits that characterize a target disease and its associated patient cohort of interest (e.g., using HbA1C measurements, medical codes and other criteria to identify patients with type II diabetes). As health data become increasingly digitalized through use of electronic health records (EHR), data-driven phenotyping has been developed as a new discipline that aims to quickly identify disease-specific cohorts from large datasets and gain insights into disease dynamics through ever-changing real-world evidence. In this context, the word "phenotype" effectively takes on new semantics as "computable phenotype," which generally refers to any clinical patterns inferred from EHR (and often genomic data as well) that can be used to make assertions about patients and their clinical conditions.
EHRs, however, are noisy, practice-based patient data, collected primarily for healthcare delivery, and therefore present a great representational gap to biomedical research such as disease phenotyping. As a result, most computational phenotyping methods today are rule-based: clinical experts pre-specify, based on their domain knowledge, a set of phenotyping rules – narrative descriptions, logical expressions, and workflows – that captures the pathology and relevant medical observations of a disease cohort.
Rule-based methods often involve a long development cycle, subject to site-dependent interpretations of the phenotyping algorithm, plus knowledge engineering and programming exercises that can stretch beyond several months to phenotype even a single disease. To achieve better scalability while generalizing to more complex diseases -- where phenotype definitions are unclear but relevant medical concepts and patterns can be learned statistically from large-scale EHR data -- a large portion of my recent work in this area has focused on developing automated, statistical machine learning-based phenotyping methodologies.
In these slides, I will present an overview of health data-driven disease phenotyping with a focus on one example project – bulk learning – an EHR-based, multi-disease phenotyping framework.
In essence, bulk learning uses a hierarchical learning approach, combined with medical ontology, to derive diagnostic components that collectively serve as phenotyping rules for a group of infectious diseases. As a multiple-disease phenotyping framework, it works much like a medical diagnosis setting (albeit through statistical means): relevant medical concepts, such as microbiology lab tests and intravenous chemistry tests, among others, serve as supporting evidence with varying degrees of confidence, estimated statistically from data, for determining positive cases while probabilistically ruling out negatives.
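For contrast with the statistical approach, here is a toy rule-based computable phenotype in pandas; the column names and patient values are invented, and the thresholds loosely follow the type II diabetes example above (HbA1c of 6.5% or higher, ICD-10 E11 codes).

```python
import pandas as pd

# Hypothetical per-patient EHR extract.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "max_hba1c": [8.1, 5.4, 6.9, 7.5],           # percent
    "icd10_codes": [{"E11.9"}, {"I10"}, {"E11.65"}, set()],
    "on_metformin": [True, False, True, True],
})

# Rule-based phenotype: lab threshold OR diagnosis code, plus medication evidence.
has_t2dm_code = patients["icd10_codes"].apply(
    lambda codes: any(code.startswith("E11") for code in codes))
phenotype = ((patients["max_hba1c"] >= 6.5) | has_t2dm_code) & patients["on_metformin"]

print(patients.loc[phenotype, "patient_id"].tolist())  # [1, 3, 4]
```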
A Value-Based Approach to Clinical Pathology and Informatics (Cirdan)
A presentation delivered by Dr. Glenn Edwards, SA Pathology at the Pathology Horizons 2017 conference in Cairns, Australia.
Pathology Horizons is an annual CPD conference organised by Cirdan on the future of pathology. More information on Pathology Horizons can be accessed at www.pathologyhorizons.com
Natural Language Processing to Curate Unstructured Electronic Health Records (MMS Holdings)
This presentation provides an overview of Natural Language Processing (NLP), an Artificial Intelligence technique that can be used to curate unstructured medical records. We will see NLP in action as part of the ICODA Grand Challenges ‘PRIEST’ project (Pandemic Respiratory Infection Emergency System Triage) Study for Low and Middle-Income Countries as a case study.
Watch full webinar -
https://info.mmsholdings.com/natural-language-processing-webinar-july-2022
Data in genomics: Dr Richard Scott, Clinical Lead for Rare Disease, 100,000 G... (NHS England)
This document discusses the importance of structured data and standardized nomenclature for analyzing genomic data at scale. It notes that every person's genome contains 3 billion DNA base pairs with around 5 million variants compared to the reference genome. The 100,000 Genomes Project aims to generate genomic and clinical data from 100,000 participants to help find treatments for rare diseases. Key challenges discussed include dealing with large amounts of genomic and associated unstructured clinical data, and the need for automated and standardized approaches using structured data models and established clinical terminologies to enable machine learning and clinical interpretation of the data.
Metabolic Profiling_techniques and approaches.ppt (Sachin Teotia)
This document discusses metabolomics profiling and the challenges faced by analytical chemists. It outlines the group's work at Aristotle University on developing new analytical methods, standardizing data extraction and quality control protocols, identifying metabolites, and collaborating across disciplines. The group aims to address bottlenecks in areas like instrumentation variability, data treatment, identification, and lack of standardization. Their work seeks to advance the field and provide insights into biochemistry, biomarkers, disease, and treatment responses through holistic analysis of small molecules.
1. The document discusses using machine learning techniques to predict heart disease by evaluating large datasets to identify patterns that can help predict, prevent, and manage conditions like heart attacks.
2. It proposes using data analytics based on support vector machines and genetic algorithms to diagnose heart disease, claiming genetic algorithms provide the best optimized prediction models.
3. The key modules described are uploading training data, pre-processing the heart disease data, using machine learning to predict heart disease, and generating graphical representations of the analyses.
Microfluidic Flow Control using Magnetohydrodynamics (KayDrive)
Fluid manipulation in microfluidic devices is one of the main research areas for the fabrication of lab-on-a-chip devices. Among the many methods applied to this problem, one of the most promising employs magnetohydrodynamic (MHD) principles, which allow elegant and versatile designs. A microchip is designed for fluid flow control that uses MHD to pump fluid through a microchannel. The design is simulated in COMSOL and the fluid's velocity profile is obtained. The microchip is fabricated, and experiments are performed by measuring the flow rate of a conducting fluid as it is pumped by the Lorentz force. The experimental results are then compared with the simulation results to assess the device's performance against theoretical computations.
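For reference, the body force that drives such an MHD pump is the Lorentz force on the current-carrying fluid:

```latex
% Lorentz body force density on a conducting fluid element:
%   f = force per unit volume, J = current density driven through the channel,
%   B = applied magnetic flux density
\mathbf{f} = \mathbf{J} \times \mathbf{B}
```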
The main objective of this project is to support local merchants by creating a community where people can share meaningful experiences and help each other find the best options available, saving time and money.
Our project focuses mainly on the manufacturing phase of a ring spinning machine.
A ring spinning machine forms thread by isolating individual fibers and twisting them coaxially about each other, producing a single coherent thread that carries the strength of each fiber.
The machine was designed and manufactured locally, reducing cost by 78.33% compared with imported machinery.
This document outlines a senior design project for an energy audit and cost analysis of an ammonia plant. The objectives are to understand what an energy audit is and why companies need one, and to analyze the energy efficiency and bottlenecks of Fatima Fertilizers' process. The team will collect plant data, perform calculations and simulations, analyze results, recommend improvements, and conduct a cost analysis. The timeline shows milestones from November 2020 to May 2021. Software such as Excel, Polymath, Aspen HYSYS, and MATLAB will be used, along with energy index data from Fatima. Regular communication with industry professionals is also part of the project's resources and references.
CodeX is a big data analytics solution that uses novel secure processing and highly efficient data search and analysis techniques. It has a microservices-based architecture in the cloud that offers unique services to Fortune 500 customers. Pegasus is a highly distributed big data storage and ETL solution that can handle structured, unstructured and semi-structured data within the same architecture using different databases. It is optimized for open source intelligence data and integrates with CodeX's big data platform.
VTrack is a mobile app that allows schools and parents to track student transportation vans. It provides live tracking of each van's location, speed, and route. Parents receive notifications of their child's pick-up and drop-off times. They can also provide feedback to the school. The app aims to ease parents' safety concerns by increasing visibility into their child's transportation. It was originally intended for a university service but shifted focus to grade schools where tracking young kids is most useful.
HearAct is a sign language interpreter for Pakistani Sign Language (PSL) that uses sensors to record hand orientation and gesture coordinates and passes that data to a recognition model to display the corresponding word or sentence. It aims to help the over 250,000 deaf or speech-impaired people in Pakistan communicate by reducing their dependency on others and enabling mobility through a portable, cost-effective solution. The project is supervised by Sir Abdul Basit and developed by team members Ayesha Dojky, Saif Rehman, Shehrbanu Karim, and Zahra Hussaini.
Woxcut is a platform that allows users to create and run cryptocurrency trading bots with custom strategies. It aims to allow users to choose from pre-existing strategies, integrate with existing exchanges via APIs, and run bot instances on the cloud. The platform also seeks to develop an internal machine learning model to help users make optimal trading decisions.
The document describes a project called "Colour It" that aims to automatically colorize grayscale images without human assistance. The system trains a computational neural network on over a million colored images to learn statistical dependencies between image semantics and textures and their colored versions. Users can upload grayscale images to be colorized by the system in a fast and realistic way. The goals are to give users the ability to easily colorize images with minimal processing time and output images that are close to the original ground truths. This technique could benefit medical imaging and colorizing old black and white films while training convolutional neural networks.
GoSpark is a platform that uses beacons and a mobile application to track customer behavior in stores. This allows businesses to gain insights into shopping patterns and trends. The goal is to help retailers increase sales and customer loyalty by providing personalized promotions and a better understanding of customers. The project will require implementing a website and mobile app connected to beacons placed in stores to anonymously track customer movement and send targeted notifications.
1. Beautyou is a virtual makeup applicator that lets users try on makeup, lenses, and accessories virtually before purchasing through an e-commerce website.
2. The objectives are to build a virtual makeup try-on that helps users test products safely and virtually and promotes online purchases of makeup.
3. Motivations include the growth of the beauty industry, the need for ways to try products virtually at home, and improvements in augmented reality technology.
The document describes a mobile application called Nan-Baby that connects parents with babysitters. The application was created by students Syeda Ayesha Fahim, Junaid Zia Khan, and Kehkashan Salman under the supervision of Mr. Asad Ali. Nan-Baby aims to provide qualified, educated babysitters who watch children either at the babysitter's home or the parent's home. The motivation was to create babysitting opportunities for female students in Pakistan. The app allows parents to book nearby babysitters and choose between having the babysitter come to them or dropping the child off.
Shift is a mobile app that uses image recognition to help online shoppers find products. It allows users to upload photos of items they want to find and it will identify colors, shapes, sizes and product categories to provide matching results. This provides a better search experience than relying only on keywords, as Shift can identify hard-to-describe items. The app aims to reduce average search times for customers and provide a simpler way to find products online through its augmented reality search features.
The document proposes a spatial design solution for the social and educational reform of street children in Karachi. The design comprises a master plan with blocks for vocational training, detoxification, rehabilitation and administration, residence, and views, aiming to provide street children with detoxification, rehabilitation, vocational training, residence, and administrative services.
This document outlines a final year project to develop a scheduling algorithm using genetic algorithms. The project is supervised by Dr. Imran Khan and involves Bushra Qureshi, Maham Faiz, Mariam Imran, and Sheeza Shakeel. The algorithm aims to effectively schedule classes while addressing constraints like classroom availability, professor and student schedules, and course requirements. It will use genetic algorithm operators like initialization, selection, crossover and mutation to generate timetables that satisfy constraints and optimize resource allocation and scheduling. The resulting automated timetable generation process is expected to reduce time spent creating schedules and address issues like clashes in current manual systems.
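A compact sketch of the genetic-algorithm loop the project describes follows; the encoding, fitness function, and rates are simplified assumptions, since a real timetabler would also score room capacities, professor availability, and course requirements.

```python
import random

random.seed(0)
SLOTS, CLASSES = 20, 15          # hypothetical: 15 class sections, 20 timeslots

def fitness(timetable):
    """Higher is better: penalize every extra class packed into a used slot."""
    return -sum(timetable.count(s) - 1 for s in set(timetable))

def crossover(a, b):
    cut = random.randrange(1, CLASSES)   # single-point crossover
    return a[:cut] + b[cut:]

def mutate(tt, rate=0.1):
    return [random.randrange(SLOTS) if random.random() < rate else s for s in tt]

# Initialization: a random slot assignment for each class.
population = [[random.randrange(SLOTS) for _ in range(CLASSES)] for _ in range(50)]

for _ in range(200):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]            # selection: keep the fittest tenth
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(40)]
    population = parents + children

print("clashes in best timetable:", -fitness(population[0]))
```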
This document outlines a final year project to design a UAV that can transform into a UGV for 3D mapping and image processing. It includes sections on the overview, block diagram, methodology, flow chart, Gantt chart, and conclusion. The objectives are to create a low-cost hybrid quadcopter that can fly autonomously using sensors and also navigate on the ground for applications like search and rescue, disaster response, and surveillance missions. It proposes using a rolling cage mechanism for the UAV to UGV transformation.
This document discusses using virtual reality exposure therapy with a Kinect motion sensor to treat phobias at medical rehabilitation centers in Pakistan. It proposes developing virtual reality environments that systematically expose patients to phobia triggers, such as heights for acrophobia therapy. Market research found interest from private hospitals to use the tool for treating anxiety disorders. The revenue model would involve subscription or software/hardware packages, and software maintenance costs. Key milestones achieved include designing interfaces for acrophobia therapy with increasing exposure levels and developing early environments for claustrophobia and spider phobia therapy.
Marketmizer is a software that will optimize Jovago Pakistan's marketing budget allocation across different online channels like SEM, social media, affiliates etc. based on the company's past marketing performance data. It will help address issues with Jovago's current manual budgeting approach by predicting future bookings and providing visualizations of marketing trends. The software will utilize machine learning algorithms and genetic algorithms to model marketing data, identify trends, validate predictions and suggest an optimal budget allocation to help Jovago achieve targets like reducing costs per booking while increasing revenues and lowering bounce rates.
CodeX is a big data analytics company that offers unique services to Fortune 500 customers through its microservices-based cloud platform. Pegasus is CodeX's big data ETL solution that handles structured, unstructured, and semi-structured data within the same distributed storage architecture using different databases for different data types. Pegasus also has built-in ETL capabilities, allows querying across databases via a central authority, and is optimized for open-source intelligence data through its integration with CodeX's big data solution.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, what a Lego brick and the XZ backdoor have in common might be that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case share much more than that.
Join the presentation to dive into a story of interoperability, standards, and open formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations, and training efforts. She previously worked on LibreOffice migrations and training for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not following her passion for computers and for Geeko, she cultivates her curiosity about astronomy (hence her nickname, deneb_alpha).
UiPath Test Automation using UiPath Test Suite series, part 6 (DianaGray10)
Welcome to part 6 of the UiPath Test Automation using UiPath Test Suite series. In this session, we cover test automation with generative AI and OpenAI.
This webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into integrating generative AI test automation with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy lets testers automate repetitive tasks, improve testing accuracy, and speed up the software testing life cycle. Topics include the integration process, practical use cases, and the benefits of AI-driven automation for UiPath testing initiatives. Testers and automation professionals will gain insights into harnessing AI to optimize their test automation workflows within the UiPath ecosystem, driving efficiency and quality in software development.
What will you get from this session?
1. Insights into integrating generative AI
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI?
Test Automation with generative AI and OpenAI
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Infrastructure Challenges in Scaling RAG with Custom AI models – Zilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, giving attendees the knowledge needed to make informed decisions. Whether you're at the early stages of adopting AI or considering integrating it into advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence – IndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
UiPath Test Automation using UiPath Test Suite series, part 5 – DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of a CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Climate Impact of Software Testing at Nordic Testing Days – Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint: a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Threats to mobile devices are increasingly prevalent and growing in scope and complexity. Users of mobile devices want to take full advantage of their features, but many of those features trade security for convenience and capability. This best practices guide outlines steps users can take to better protect their personal devices and information.
HCL Notes and Domino License Cost Reduction in the World of DLAU – panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar, with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered:
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 – Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphRAG for Life Science to increase LLM accuracy – Tomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers.
2. OBJECTIVES
• Assist medical professionals in diagnosis
• Predict probable diseases and diagnoses
• Provide personalized healthcare to patients
3. MOTIVATION & BACKGROUND
• Too many patients, but very few doctors
• Doctors are short on time and overlook details
• Lab tests can end in false diagnoses
• Diagnosis can vary with the doctor's mood
4. MOTIVATION & BACKGROUND
• EMR data is not utilized properly
– Patient's personal information and medical history are not taken into account
– Patients are often prescribed unnecessary tests
• Demographic characteristics are ignored
– Existing expert systems do not take them into account
– These account for significant differences in baselines
5. METHODOLOGY
• Extract rules from data provided by UMDC (see the sketch after this list)
– This process will make use of data mining methods such as neuro-fuzzy learners
• Extract rules from medical literature
– Online repositories such as PubMed, Medscape, and Wikipedia
– Crawl data from them using web crawlers such as PHPcrawl
• Take baseline differences into account during rule generation.
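To make the rule-extraction step concrete, here is a minimal Python sketch. The slides name neuro-fuzzy learners on UMDC data; this sketch instead uses a plain scikit-learn decision tree purely because its IF-THEN rules are easy to print, and every record, feature, and threshold below is invented for illustration.

# Minimal sketch: derive readable IF-THEN rules from labeled lab results.
# NOTE: the project itself uses neuro-fuzzy learners; a decision tree is
# shown only because its rules print cleanly. All data here is made up.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical records: [haemoglobin (mg/dl), glucose_fasting (mg%)]
X = [[13.5, 90], [14.0, 95], [8.0, 100], [7.5, 105], [13.0, 160], [12.5, 180]]
y = ["healthy", "healthy", "anaemia", "anaemia", "diabetes", "diabetes"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned rules so domain experts can review them before
# they are loaded into the expert system.
print(export_text(tree, feature_names=["haemoglobin", "glucose_fasting"]))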
6. METHODOLOGY
● Generated rules will be accessible to doctors
– Through an Excel spreadsheet containing the result values of lab tests
– Rules presented in a table, with each row denoting the test result parameter values for a disease
– Doctors can add and edit parameter values and diseases without needing any programming skills
● The rules will then be converted into XML for updating the expert system (a sketch of this conversion follows below)
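A minimal Python sketch of the spreadsheet-to-XML step is shown below; the element and attribute names are hypothetical, since the slides do not specify the actual schema.

# Minimal sketch: serialize spreadsheet-style rule rows into XML for the
# expert system. Element and attribute names are hypothetical; the
# slides do not define the real schema.
import xml.etree.ElementTree as ET

# Each row as a doctor might enter it: (disease, test, lower, upper).
rows = [
    ("Anaemia", "Haemoglobin", 0.0, 11.5),
    ("Diabetes", "Glucose Fasting", 110.0, 400.0),
]

root = ET.Element("rules")
for disease, test, lo, hi in rows:
    rule = ET.SubElement(root, "rule", disease=disease)
    ET.SubElement(rule, "condition", test=test, min=str(lo), max=str(hi))

# The serialized document the expert system would reload on update.
print(ET.tostring(root, encoding="unicode"))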
7. METHODOLOGY
• Ranked list of possible diseases based on rules and scoring (see the sketch after this list)
• Storage and retrieval of patients' previous diagnoses to improve prediction accuracy
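The ranking itself can be as simple as scoring each disease by the fraction of its rules that a patient's results satisfy; the Python sketch below assumes that scoring scheme, which the slides do not spell out.

# Minimal sketch: rank candidate diseases by the fraction of their rules
# matched by a patient's results. The scoring formula is an assumption;
# the slides do not define the exact score. Rules below are made up.
RULES = {
    "Anaemia": [("Haemoglobin", 0.0, 11.5)],
    "Diabetes": [("Glucose Fasting", 110.0, 400.0),
                 ("Glucose Random", 180.0, 500.0)],
}

def rank_diseases(results):
    scores = []
    for disease, conditions in RULES.items():
        matched = sum(1 for test, lo, hi in conditions
                      if test in results and lo <= results[test] <= hi)
        scores.append((disease, matched / len(conditions)))
    # Highest score first: the ranked list presented to the doctor.
    return sorted(scores, key=lambda s: s[1], reverse=True)

print(rank_diseases({"Haemoglobin": 9.8, "Glucose Fasting": 95.0}))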
9. EXTENSIONS
• Use of symptoms during prediction
• Medical analysis based on demographic characteristics such as gender, residential address, etc.
• Integration of the expert system with an existing hospital EMR
• Risk monitoring system to identify patients at risk
10. DATA UNDERSTANDING
▪ The Blood Test Data provided by UMDC contains about 200,000 records
▪ Multiple tests for about 54,000 patients
▪ Of these, diagnoses are recorded for only 3,000
▪ Patient Tests:
Test Code   Test Name        Normal Values Range
1           Haemoglobin      11.5 – 18 (mg/dl)
17          Urea             10 – 50 (mg%)
18          Creatinine       0.5 – 1.5 (mg%)
25          Potassium        3.8 – 5.2 (mEq/L)
47          Glucose Fasting  70 – 110 (mg%)
48          Glucose Random   80 – 180 (mg%)
15. DATA UNDERSTANDING
• Problems with the data
― Multiple diagnoses of patients at the same date and time
― Test codes inconsistent with test names, e.g. Haemoglobin records are classified under test code 1 and most Glucose (fasting) records under test code 47, yet a few Glucose (fasting) records are misclassified under test code 1
― Some test names are inconsistent, e.g. the Haemoglobin test name is recorded as “Haemoglobin”, “Hb”, and “Haemoglobin %”
― Human errors in data entry, e.g. temperature recorded as 980 °F (probably an attempt to record 98.0)
17. DATA UNDERSTANDING
• Problems with the data
– Multiple test result values are recorded against the same registration number at the same date and time.
18. DATA UNDERSTANDING
– Test value inconsistency: over 800 cells found with text such as ‘127 (AFTER GLOCOUSE 01 HR)’ and ‘AFTER 75GRM GLOCOUSE 01HR (92)’
19. DATA UNDERSTANDING
– The test code and test name inconsistency problems were solved with Excel formulas such as:
=IF(OR(P2="Haemoglobin %",P2="Hb"),"Haemoglobin",P2)
– and:
=IF(N2="true",(MID(L2,SEARCH("(",L2)+1,SEARCH(")",L2,SEARCH("(",L2)+1)-SEARCH("(",L2)-1)),N2)
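The same two clean-up steps can also be expressed outside Excel; the Python sketch below mirrors them (the name map and example cells come from the slides above, everything else is illustrative).

# Minimal sketch of the two Excel clean-ups in Python: normalize test
# names, and pull a numeric value out of free-text result cells.
import re

NAME_MAP = {"Hb": "Haemoglobin", "Haemoglobin %": "Haemoglobin"}

def normalize_test_name(name):
    # Mirrors =IF(OR(P2="Haemoglobin %",P2="Hb"),"Haemoglobin",P2)
    return NAME_MAP.get(name, name)

def extract_value(cell):
    # Mirrors the MID/SEARCH formula: prefer a number in parentheses,
    # otherwise take a leading number, otherwise leave the cell as-is.
    m = re.search(r"\((\d+(?:\.\d+)?)\)", cell)
    if m:
        return m.group(1)
    m = re.match(r"\s*(\d+(?:\.\d+)?)", cell)
    return m.group(1) if m else cell

print(normalize_test_name("Hb"))                        # Haemoglobin
print(extract_value("AFTER 75GRM GLOCOUSE 01HR (92)"))  # 92
print(extract_value("127 (AFTER GLOCOUSE 01 HR)"))      # 127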
24. DATA CLEANING
• Handling missing values: since a patient whose test reports are clear will have values within the normal range, we handled missing values by inserting the average of the normal range bounds for each test (a sketch follows below)
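A minimal Python sketch of that imputation rule, with normal ranges taken from the patient test table above (the pandas usage itself is illustrative):

# Minimal sketch: fill a missing test value with the average of the
# normal range bounds, assuming an unreported test was normal.
import pandas as pd

NORMAL_RANGE = {
    "Haemoglobin": (11.5, 18.0),   # mg/dl
    "Urea": (10.0, 50.0),          # mg%
    "Creatinine": (0.5, 1.5),      # mg%
}

df = pd.DataFrame({"Haemoglobin": [13.2, None], "Urea": [None, 32.0]})

for test, (lo, hi) in NORMAL_RANGE.items():
    if test in df.columns:
        df[test] = df[test].fillna((lo + hi) / 2)  # midpoint of normal range

print(df)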
27. CONCLUSION
• Aim to build a medical expert system to assist medical professionals, especially doctors, in diagnosis
• Want to make medical literature a direct support for diagnosis
• Want to allow patients to be provided personalised treatment using their medical history
• Wish to serve the medical community as computer scientists, considering the field's interdisciplinary nature