Rough Set based Decision Tree for Identifying Vulnerable and Food Insecure Households
Authors: Rajni Jain, S. Minz and P. Adhiguru

Organizations: NCAP and Jawaharlal Nehru University

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

Rough Set based Decision Tree for Identifying Vulnerable and Food Insecure Households Presentation Transcript

  • 1. Rough Set based Decision Tree for Identifying Vulnerable and Food Insecure Households
    Rajni Jain1, S. Minz2 and P. Adhiguru1
    1 Sr. Scientist, NCAP, Pusa, New Delhi
    2 Associate Professor, Jawaharlal Nehru University
  • 4. The KDD process: Data → Target Data → Pre-processed Data → Transformed Data → Patterns → Knowledge
    (phases: Selection, Preprocessing, Transformation, Data Mining, Interpretation)
    • The Selection phase defines the KDD problem by focusing on a subset of data attributes or data samples on which KDD is to be performed.
    • In Preprocessing, care must be taken not to induce any unwanted bias; steps include removing noise and handling missing data.
    • Transformations may include combining attributes or discretizing continuous attributes.
    • In the Data Mining step, many different learning and modeling algorithms are potential candidates.
  • 6. Classification in three steps:
    Step I: Training Data + Classification Algorithm → Rules/Tree/Formula
    Step II: Estimate the predictive accuracy of the model.
    Step III: If acceptable, apply the classification rules to label the class of new data.
  • 11. Indiscernibility relation
    Q: set of attributes of an information system S, P ⊆ Q
    U: universe; x and y: two objects in the universe
    f(x, a): value of the attribute a for the object x
    IND_S(P) = {(x, y) ∈ U × U : f(x, a) = f(y, a) ∀a ∈ P}
    IND(P) is an equivalence relation; it partitions U (the universe) into equivalence classes.
    U/IND(P): the set of these partitions.
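The definition above can be sketched in code: grouping objects by their value tuple on P yields the partition U/IND(P). The helper below is a minimal sketch, and the flu-patient rows (taken from slide 12 of the deck) are used only as sample input.

```python
from collections import defaultdict

def partition(universe, table, P):
    """Return U/IND(P): equivalence classes of objects indiscernible on P."""
    classes = defaultdict(list)
    for x in universe:
        key = tuple(table[x][a] for a in P)   # f(x, a) for each a in P
        classes[key].append(x)
    return [sorted(c) for c in classes.values()]

# Flu-patient rows from slide 12 (condition attributes H, M, T only)
flu = {1: {'H': 'n', 'M': 'y', 'T': 'h'},
       2: {'H': 'y', 'M': 'n', 'T': 'h'},
       3: {'H': 'y', 'M': 'y', 'T': 'vh'},
       4: {'H': 'n', 'M': 'y', 'T': 'n'},
       5: {'H': 'y', 'M': 'n', 'T': 'h'},
       6: {'H': 'n', 'M': 'y', 'T': 'vh'}}
U_IND_P = partition(range(1, 7), flu, ['H', 'M', 'T'])
```

Objects 2 and 5 agree on all three attributes, so they fall into the same equivalence class.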
  • 12. Approximations
    Flu Patients:
      Id  H  M  T   Flu
      1   n  y  h   y
      2   y  n  h   y
      3   y  y  vh  y
      4   n  y  n   n
      5   y  n  h   n
      6   n  y  vh  y
    Lower approximation: P̲X = ∪{Y ∈ U/IND(P) : Y ⊆ X}
    Upper approximation: P̄X = ∪{Y ∈ U/IND(P) : Y ∩ X ≠ ∅}
    Boundary region: BN_P(X) = P̄X - P̲X
  • 13. Regions for the flu example, with P = {H, M, T} and X = {x : Flu = y}:
    Positive region (lower approximation): {{1}, {3}, {6}}
    Boundary region: {{2, 5}}
    Negative region: {{4}}
    Accuracy of approximation: α_P(X) = |P̲(X)| / |P̄(X)|
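The regions quoted above can be recomputed directly from the flu table on slide 12; a minimal sketch (X is the set of patients with Flu = y, P = {H, M, T}):

```python
from collections import defaultdict

# Flu table from slide 12: columns H, M, T and the decision Flu
rows = {1: ('n', 'y', 'h', 'y'), 2: ('y', 'n', 'h', 'y'),
        3: ('y', 'y', 'vh', 'y'), 4: ('n', 'y', 'n', 'n'),
        5: ('y', 'n', 'h', 'n'), 6: ('n', 'y', 'vh', 'y')}
X = {x for x, r in rows.items() if r[3] == 'y'}      # patients with Flu = y

# U/IND(P) for P = {H, M, T}: group ids by their condition-value tuple
classes = defaultdict(set)
for x, r in rows.items():
    classes[r[:3]].add(x)

lower = set().union(*(Y for Y in classes.values() if Y <= X))   # positive region
upper = set().union(*(Y for Y in classes.values() if Y & X))    # possibly in X
boundary = upper - lower                                        # BN_P(X)
alpha = len(lower) / len(upper)                                 # accuracy α_P(X)
```

This reproduces the slide's regions: lower = {1, 3, 6}, boundary = {2, 5}, and α_P(X) = 3/5.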
  • 16. Data → Reduct Computation Algorithm → Reduct → Remove attributes absent in the reduct → Reduced Training Data → ID3 Algorithm → DT
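A reduct is a minimal attribute subset that preserves the classification given by the full attribute set. The deck's own reduct computation algorithm is not recoverable from this transcript, so the sketch below finds a decision-relative reduct by exhaustive search over a toy table; this is only feasible for small attribute sets.

```python
from itertools import combinations

def classes(table, attrs):
    """Group object ids into equivalence classes by their values on attrs."""
    out = {}
    for x, row in table.items():
        out.setdefault(tuple(row[i] for i in attrs), set()).add(x)
    return list(out.values())

def preserves_decision(table, attrs, d):
    """True if every equivalence class under attrs is pure in decision column d."""
    return all(len({table[x][d] for x in Y}) == 1 for Y in classes(table, attrs))

def reduct(table, cond, d):
    """Smallest subset of cond that still determines d (exhaustive search)."""
    for k in range(1, len(cond) + 1):
        for subset in combinations(cond, k):
            if preserves_decision(table, subset, d):
                return subset
    return tuple(cond)

# Toy, consistent table: column 0 alone determines the decision (column 2)
toy = {1: ('a', 'x', 'p'), 2: ('a', 'y', 'p'),
       3: ('b', 'x', 'q'), 4: ('b', 'y', 'q')}
```

After the reduct step, columns absent from the reduct are dropped before ID3 is run, as the pipeline above shows.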
  • 17. Decision Tree (root CHLD; internal nodes HAGE and LAND; leaf labels 0/1)
    CHLD: y → 0, n → HAGE
    HAGE: branches young, middle, old, very old lead to leaves 0/1, with one branch testing LAND
    LAND: branches lead to leaves 1 and 0
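A tree like this can be encoded as nested dictionaries with a small classifier loop. Since the slide's branch-to-leaf mapping is ambiguous after extraction, the branch labels below are illustrative only:

```python
# A decision tree as nested dicts: an inner node maps attribute values to
# subtrees, a leaf is a class label. Branch labels here are illustrative,
# not read off the original slide.
tree = {'CHLD': {'y': 0,
                 'n': {'HAGE': {'young': 1,
                                'middle': {'LAND': {'y': 1, 'n': 0}},
                                'old': 1,
                                'very old': 0}}}}

def classify(node, record):
    """Descend from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        attr, branches = next(iter(node.items()))
        node = branches[record[attr]]
    return node
```

For example, any household with children (CHLD = y) is labeled 0 immediately; otherwise classification continues through HAGE and possibly LAND.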
  • 20. Energy is used as a proxy for measuring food insecurity of the household.
  • 21. Morphological Attributes
    HouseHold_Id
    1. Land: whether the household owns land
    2. Hedu: highest education of the head
    3. Hage: age of the head of the household
    4. Chld: whether there are children in the family
    5. Flsz: number of members in the family
    6. PrWm: proportion of women to family size
    7. Hstd: whether the household owns a homestead garden
    8. Pear: proportion of earning members to family size
    PCENER: energy/capita/day in terms of kcal
    9. Decision: derived from PCENER
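The slide says the Decision attribute is derived from PCENER. The calorie cutoff used by the authors is not recoverable from this transcript, so the threshold in this sketch is a placeholder assumption:

```python
# Derive the binary Decision attribute from PCENER (kcal/capita/day).
# CUTOFF_KCAL is a placeholder assumption, not the paper's actual norm.
CUTOFF_KCAL = 2400

def decision(pcener):
    """0: vulnerable / food insecure, 1: not vulnerable (labels per slide 29)."""
    return 0 if pcener < CUTOFF_KCAL else 1
```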
  • 26. Algorithms compared
    Algorithm  Description
    RS      Rough set with full discernibility decision relative reduct
    CJU     Continuous data, J4.8, unpruned DT
    CJP     Continuous data, J4.8 algorithm, pruned DT
    DID3    RS based discretization, no reduct, ID3
    RDT     RS based discretization, global reduct, ID3
    DJU     Discretized using RS, J4.8, unpruned
    DJP     Discretized using RS, J4.8, pruned
    RJU     Discretized, global reducts, J4.8, unpruned DT
    RJP     Discretized, global reducts, J4.8, pruned DT
    DRJU    Discretized, dynamic reduct, J4.8, unpruned DT
    DRJP    Discretized using RS, dynamic reduct, J4.8, pruned DT
  • 28. Composite score for comparing classifiers:
    CS = (1/4) · (A + 1/S + 1/Nr + 1/Na)
    where A is the accuracy (as a fraction), S the size (complexity) of the model, Nr the number of rules, and Na the number of attributes.
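A quick sketch of the measure, with accuracy entered as a percentage to match the table on slide 30 (the /100 normalization is inferred from the published CS values, which it reproduces):

```python
def cs(accuracy_pct, size, n_rules, n_attrs):
    """CS = (A + 1/S + 1/Nr + 1/Na) / 4, with accuracy given in percent."""
    return (accuracy_pct / 100 + 1 / size + 1 / n_rules + 1 / n_attrs) / 4

# Reproduces slide 30's values, e.g. DRJP (A=73, S=43, Nr=9, Na=4.0) -> 0.28
drjp = round(cs(73, 43, 9, 4.0), 2)
```

Note how the reciprocal terms reward small models: DRJP ties CJP and DJP on accuracy (73%) but wins on CS because it needs only 9 rules.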
  • 29. Evaluation of Simplified DT
    Accuracy = 73%
    Complexity = 43
    Number of rules = 9
    Number of attributes = 4
    Class 0: poorest and vulnerable to food insecurity
    Class 1: not vulnerable to food insecurity
  • 30. Comparing Algorithms using CS
    Id     A   S     Nr   Na   CS
    RS     51  1003  149  6.7  0.17
    CJU    69  173   26   8    0.21
    CJP    73  40    10   7    0.25
    DID3   60  262   79   7.3  0.19
    RDT    59  269   82   6.8  0.19
    DJU    67  188   56   7.1  0.21
    DJP    73  43    16   4.2  0.26
    RJU    68  177   55   6.4  0.21
    RJP    72  43    17   4.0  0.27
    DRJU   67  186   56   6.6  0.21
    DRJP   73  43    9    4.0  0.28
  • 31. [Charts comparing the algorithms (RS, CJU, CJP, DID3, RDT, DJU, DJP, RJU, RJP, DRJU, DRJP) on Accuracy (%), Complexity, number of Rules, and number of Attributes.]
  • 32. DT (DRJP) - Nutrition Data
    Accuracy = 73%, Complexity = 43, Attributes = 4, Rules = 9
    Root CHLD: y → 0, n → HAGE
    HAGE: branches <40, 40, [41,51), >51 lead to leaves 0/1, with one branch testing FLSIZE
    FLSIZE: branches <4, 4, >4 lead to leaves 0/1, with one branch testing PEAR
    PEAR: branches <45, [45,54), >45 lead to leaves 1, 1, 0