PREDICTIVE ANALYSIS OF
HEALTH RECORDS
Umang Shukla
Pranay Sharma
Krishnan Iyer
Monica
Predictive Analysis of Health Records using MATLAB

Predictive analysis of health records using MATLAB's Fuzzy Logic Toolbox.
The dataset is the Pima Indians Diabetes Database from the UCI Machine Learning Repository.

  1. PREDICTIVE ANALYSIS OF HEALTH RECORDS. Umang Shukla, Pranay Sharma, Krishnan Iyer, Monica
  2. OVERVIEW: ➜ About dataset ➜ What is fuzzy logic ➜ Basics of fuzzy ➜ Work done ➜ Observations
  3. ABOUT THE DATASET
      Pima Indians Diabetes Database
      Source: UCI Machine Learning Repository
      Owner: National Institute of Diabetes and Digestive and Kidney Diseases
      Donor: Vincent Sigillito
      Date received: 9th May 1990
      Patients are females of Pima Indian heritage, at least 21 years old.
      Number of instances: 768
      Number of attributes: 8, plus the class variable. The columns are:
      1. Number of times pregnant
      2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
      3. Diastolic blood pressure (mm Hg)
      4. Triceps skin fold thickness (mm)
      5. 2-hour serum insulin (mu U/ml)
      6. Body mass index (weight in kg/(height in m)^2)
      7. Diabetes pedigree function
      8. Age (years)
      9. Class variable (0 or 1)
  4. What is Fuzzy Logic? Fuzzy logic is an approach to computing based on "degrees of truth" rather than the usual "true or false" Boolean logic. It was first advanced by Dr. Lotfi Zadeh of the University of California, Berkeley. He said any logical system could be fuzzified.
  5. LET’S REVIEW SOME CONCEPTS
      Fuzzy Sets: Let X be a non-empty set. A fuzzy set A in X is characterized by its membership function µA: X -> [0,1], where µA(x) is the degree of membership of element x in fuzzy set A for each x ∈ X.
      Operations: Union, Intersection, Complement, Containment.
      Membership Function: Maps each element of a fuzzy set to a real value in the interval 0 to 1. Examples: Triangular, Trapezoidal, S-shaped, Sigmoid, Pi-function.
      Fuzzification: The process of transforming crisp input values into linguistic values, each with a degree of membership.
      Defuzzification: Converts the fuzzy output back into a single crisp value. Types: Max-membership method, Centroid method, Weighted average method.
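The set operations and the centroid method listed above can be sketched in a few lines. This is a plain Python illustration under the standard max/min/1-µ definitions, not the authors' MATLAB code; the sets `young` and `middle_aged` and all their degrees are made-up example values.

```python
# Fuzzy sets over a small discrete universe, stored as {element: degree}.
# All elements and degrees here are invented for illustration.

def fuzzy_union(a, b):
    """Union: pointwise maximum of the membership degrees."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_intersection(a, b):
    """Intersection: pointwise minimum of the membership degrees."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_complement(a):
    """Complement: one minus the membership degree."""
    return {x: 1.0 - mu for x, mu in a.items()}

def is_contained(a, b):
    """Containment: A is a subset of B if mu_A(x) <= mu_B(x) for every x."""
    return all(mu <= b.get(x, 0.0) for x, mu in a.items())

def centroid(fuzzy_set):
    """Centroid defuzzification: membership-weighted mean of the elements."""
    total = sum(fuzzy_set.values())
    if total == 0:
        raise ValueError("cannot defuzzify an empty fuzzy set")
    return sum(x * mu for x, mu in fuzzy_set.items()) / total

young = {20: 1.0, 30: 0.6, 40: 0.2}
middle_aged = {20: 0.1, 30: 0.7, 40: 1.0}

both = fuzzy_intersection(young, middle_aged)
print(fuzzy_union(young, middle_aged))
print(both, is_contained(both, young))
print(round(centroid(both), 2))
```

The intersection is always contained in each of its operands, which is a quick sanity check on the min/max definitions.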
  6. WORK DONE
  7. Step 1: DATA CLEANING
  8. Step 2: SPLITTING OF DATASET
  9. Step 3: BINNING OF TRAINING SET
  10. Step 4: RULE GENERATION USING J48
  11. Classification tree by J48
  12. Step 5: FUZZIFICATION OF BINNED INPUT AND OUTPUT
  13. Fuzzy Inference System
  14. Step 6: DEFINING FUZZY RULE
  15. Fuzzified Rules
  16. Step 7: TEST SET EVALUATION
  17. EVALUATION TECHNIQUES: INTERACTIVE
  18. EVALUATION TECHNIQUES: ON MATLAB TERMINAL. This can be done with the evalfis function in MATLAB: output = evalfis(input, fismat). evalfis() takes the following arguments: ➜ input: a number or a matrix specifying input values. ➜ fismat: an FIS structure to be evaluated.
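To make the call shape concrete, here is a rough Python sketch of what evaluating a Mamdani-style fuzzy inference system on a batch of inputs involves: fuzzify each input, clip each rule's output set by its firing strength, aggregate with max, and defuzzify by centroid. It is not the MATLAB implementation. The one-input system `FIS`, its breakpoints, its two rules, and the helper names are all hypothetical and are not the rules derived on the slides.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def ramp_down(x, lo, hi):
    """Membership falling linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)

# Hypothetical system: one input (plasma glucose), one output in [0, 1].
FIS = {
    "input_sets": {
        "low": lambda x: ramp_down(x, 100.0, 140.0),
        "high": lambda x: ramp_up(x, 100.0, 140.0),
    },
    "output_sets": {
        "no": lambda y: ramp_down(y, 0.2, 0.8),
        "yes": lambda y: ramp_up(y, 0.2, 0.8),
    },
    # Each rule reads: IF input is <first> THEN output is <second>.
    "rules": [("low", "no"), ("high", "yes")],
}

def evaluate_fis(inputs, fis, samples=101):
    """Return one crisp output per input value (min implication, max aggregation)."""
    ys = [i / (samples - 1) for i in range(samples)]
    outputs = []
    for x in inputs:
        aggregated = [0.0] * samples
        for in_label, out_label in fis["rules"]:
            strength = fis["input_sets"][in_label](x)
            out_mf = fis["output_sets"][out_label]
            for k, y in enumerate(ys):
                aggregated[k] = max(aggregated[k], min(strength, out_mf(y)))
        area = sum(aggregated)
        outputs.append(sum(y * m for y, m in zip(ys, aggregated)) / area)
    return outputs

print(evaluate_fis([80.0, 120.0, 180.0], FIS))
```

As with the MATLAB call, a list of input values goes in and one crisp output per input comes out; the output is later compared against a cut-off to obtain a class label.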
  19. RESULTS AND OBSERVATION
  20. OBSERVATIONS. Before interpreting the results, recall the trapezoidal membership function that we used to define the input variables: trapmf = f(x, a, b, c, d), where a and d are the feet of the trapezoid and b and c are its shoulders.
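A minimal Python version of a trapezoidal membership function, assuming the usual convention that a and d are the feet and b and c the shoulders. The breakpoints in the usage lines are invented, not the ones used for the Pima inputs.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear in between."""
    if not a <= b <= c <= d:
        raise ValueError("breakpoints must satisfy a <= b <= c <= d")
    if b <= x <= c:
        return 1.0          # flat top (also covers a == b or c == d)
    if x <= a or x >= d:
        return 0.0          # outside the support
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

# Hypothetical "medium plasma glucose" set.
for x in (80, 100, 130, 160, 180):
    print(x, trapmf(x, 90, 110, 150, 170))
```

Checking the flat top first means a degenerate trapezoid with a == b (a left shoulder) still returns 1 at x = a and avoids a division by zero.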
  21. OBSERVATIONS. Our test dataset had 332 instances. We evaluated our FIS model with 5%, 10%, 15% and 20% variance in the a, b, c, d points of each input membership function. We then took 0.65 as the cut-off on the output membership value to classify predictions as “yes” or “no”. None of these changes to the input boundaries affected the accuracy of the predictions, except for changes made to the input “plasma”. On digging deeper we found the reason for this behaviour: even though the accuracy did not change, the variances did affect the output membership value, but none of the shifts was large enough to cross the 0.65 threshold that we had set for output classification.
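The threshold argument can be illustrated with a small Python sketch. The output values below are invented, and treating a value exactly at 0.65 as “yes” is an assumption, since the slides do not say which side the boundary falls on.

```python
THRESHOLD = 0.65  # cut-off stated on the slide

def classify(output_value, threshold=THRESHOLD):
    """Map a crisp FIS output to a class label (>= is an assumption)."""
    return "yes" if output_value >= threshold else "no"

def changed_predictions(before, after, threshold=THRESHOLD):
    """Indices whose class label differs between two sets of FIS outputs."""
    return [
        i
        for i, (b, a) in enumerate(zip(before, after))
        if classify(b, threshold) != classify(a, threshold)
    ]

# Made-up outputs for four test instances, before and after shifting
# the breakpoints of one input membership function.
baseline = [0.50, 0.58, 0.70, 0.64]
shifted = [0.53, 0.61, 0.72, 0.66]

print([classify(v) for v in baseline])
print([classify(v) for v in shifted])
print(changed_predictions(baseline, shifted))
```

Every output moves, but only the last instance crosses 0.65, so only that prediction (and hence the accuracy) can change. This matches the behaviour described on the slide: the output values shift, yet the labels mostly stay put.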
  22. OBSERVATIONS
      MODEL                              ACCURACY
      J48                                74.14%
      DEFAULT (all with 5% variance)     80.722%
      10% variance in plasma             81.024%
      15% variance in plasma             81.626%
      20% variance in plasma             81.626%
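As a quick arithmetic check, the fuzzy-model percentages can be converted back into counts of correct predictions out of the 332 test instances. The counts are inferred here, not reported on the slides, and the check assumes the percentages were truncated (not rounded) to three decimals.

```python
TEST_SIZE = 332  # test instances, as stated on the slides

# Accuracies (%) reported for the fuzzy configurations.
REPORTED = {
    "default (all 5%)": 80.722,
    "plasma 10%": 81.024,
    "plasma 15%": 81.626,
    "plasma 20%": 81.626,
}

def implied_correct(percent, n=TEST_SIZE):
    """Nearest whole number of correct predictions for a given accuracy."""
    return round(percent / 100 * n)

def truncated_percent(correct, n=TEST_SIZE, digits=3):
    """Accuracy in percent, truncated (not rounded) to `digits` decimals."""
    factor = 10 ** digits
    return (correct * 100 * factor // n) / factor

for name, percent in REPORTED.items():
    k = implied_correct(percent)
    print(f"{name}: {k}/{TEST_SIZE} -> {truncated_percent(k)}%")
```

Under these assumptions the rows correspond to 268, 269 and 271 correct predictions, so the gain from widening plasma amounts to three test instances. The J48 figure does not correspond to a whole number of correct predictions out of 332, so it may have been measured on a different split.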
  23. OBSERVATIONS. The fuzzy system performed better than J48 on the same classification task; J48 uses crisp data values. Only plasma affected the accuracy, because plasma was involved in all the rules defined above. As we increased the input variance of plasma, the accuracy increased, but only up to a point: it was the same at 15% and 20%.
  24. Thanks! Any questions?