Data mining using MATLAB codes

How to use MATLAB and Weka in data mining

Data mining using MATLAB codes: Presentation Transcript

  • 1. Data Mining Using MATLAB Codes, by Ahmad Karawash
  • 2. Overview: Network; Data used; Create the graph; Display graph; Learning parameters; Inference; Conclusion
  • 3. Network
  • 4. Data used: the file asia10000.mat, which contains 10,000 records about the Chest Clinic domain.
  • 5. Create graph
        N=8;
        dag=zeros(N,N);
        A=1;S=2;T=3;L=4;B=5;E=6;X=7;D=8;
        dag(A,T)=1;
        dag(S,[L B])=1;
        dag(T,E)=1;
        dag(L,E)=1;
        dag(E,[X D])=1;
        dag(S,B)=1;
        dag(B,D)=1;
        discrete_nodes=1:N;
        node_sizes=[2 2 2 2 2 2 2 2];
        bnet=mk_bnet(dag,node_sizes,discrete_nodes);
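
The adjacency matrix on slide 5 can be sanity-checked without any toolbox. The following is a minimal sketch in plain MATLAB, not part of the original slides: it rebuilds the same matrix, verifies that the graph has no directed cycle, and lists the parents of each node. The names `labels`, `is_acyclic` and `num_edges` are introduced here for illustration only.

```matlab
% Rebuild the adjacency matrix from slide 5 (plain MATLAB, no toolbox needed).
N = 8;
dag = zeros(N, N);
A = 1; S = 2; T = 3; L = 4; B = 5; E = 6; X = 7; D = 8;
dag(A, T) = 1;
dag(S, [L B]) = 1;
dag(T, E) = 1;
dag(L, E) = 1;
dag(E, [X D]) = 1;
dag(B, D) = 1;

% A directed graph on N nodes is acyclic exactly when its adjacency matrix
% is nilpotent, i.e. there is no walk of length N.
is_acyclic = ~any(any(dag^N));

% The parents of node i are the rows that hold a 1 in column i.
labels = 'ASTLBEXD';   % illustrative one-letter names, same order as above
for i = 1:N
    parents = find(dag(:, i))';
    disp([labels(i) ' <- [' labels(parents) ']']);
end

num_edges = sum(dag(:));
```

A check like this is cheap to run before handing the matrix to `mk_bnet`, since a misplaced edge is much easier to spot in a parent list than in the matrix itself.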
  • 6. Display graph
        names = {'VisitToAsia', 'Smoker', 'HasTuberCulosis', 'HasLungCancer', ...
                 'HasBronchitis', 'TuberculosisOrCancer', 'PositiveX-Ray', 'Dyspnoea'};
        carre_rond = [1 1 1 1 1 1 1 1];
        draw_graph(bnet.dag,names,carre_rond);
        title('medical domain');
  • 7. Learning parameters
        load asia10000.mat;
        nsamples = size(asia10000,1); % number of records (assumes the file defines a variable named asia10000)
        bnet.CPD{E}=tabular_CPD(bnet,E);
        bnet.CPD{T}=tabular_CPD(bnet,T);
        bnet.CPD{L}=tabular_CPD(bnet,L);
        bnet.CPD{S}=tabular_CPD(bnet,S);
        bnet.CPD{A}=tabular_CPD(bnet,A);
        bnet.CPD{D}=tabular_CPD(bnet,D);
        bnet.CPD{B}=tabular_CPD(bnet,B);
        bnet.CPD{X}=tabular_CPD(bnet,X);
        % learn_params expects data(i,m) = value of node i in case m,
        % so records stored one per row are transposed (assumption about the file layout)
        bnet=learn_params(bnet,asia10000');
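
For fully observed discrete data, maximum-likelihood parameter learning amounts to counting and normalising. The sketch below illustrates that idea for a single edge S -> B in plain MATLAB. The eight records are hypothetical (they are not taken from asia10000.mat), and the coding 1 = false, 2 = true follows the comments on the inference slide.

```matlab
% Hypothetical toy records (NOT from asia10000.mat): columns are [S B],
% coded 1 = false, 2 = true.
data = [1 1; 1 1; 1 2; 1 1; 2 2; 2 2; 2 1; 2 2];

% Count joint occurrences: counts(s, b) = number of records with S=s and B=b.
counts = accumarray(data, 1, [2 2]);

% Normalise each row so that cpt(s, :) = P(B | S = s).
cpt = counts ./ repmat(sum(counts, 2), 1, 2);
disp(cpt);
```

Each row of `cpt` sums to 1, which is the property a conditional probability table must have for every parent configuration.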
  • 8. Load CPT
        CPT = cell(1,N);
        for i=1:N
            s=struct(bnet.CPD{i});
            CPT{i}=s.CPT;
        end
        celldisp(CPT)
  • 9. Inference (via MATLAB code)
        engine=jtree_inf_engine(bnet);
        evidence=cell(1,N);
        evidence{T}=1; % T=1 => has no tuberculosis
        evidence{L}=2; % L=2 => has lung cancer
        evidence{B}=1; % B=1 => has no bronchitis
        [engine,loglik]=enter_evidence(engine,evidence);
        marg=marginal_nodes(engine,E);
        % Displaying the result of inference
        fprintf('\nResult of the inference\n');
        fprintf('P(E | T=1, L=2, B=1) = [%3.5f %3.5f]\n',marg.T)
    Result of the inference:
        P(E | T=1, L=2, B=1) = [1.0000 0.0000]
    Since 1 > 0, E is classified as true. This is the expected result, because E holds whenever T or L holds, so the network can be used for classification.
  • 10. Conclusion: we can now compute P(anything | anything), that is, the probability of any variable given any evidence.
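
To make "P(anything | anything)" concrete, here is a small sketch of inference by enumeration on a two-node fragment S -> L, written in plain MATLAB. The probabilities are hypothetical values chosen only for illustration, not parameters learned from the Asia data, and the variable names `pS`, `pL_given_S`, `joint` and `posterior` are assumptions of this example.

```matlab
% Hypothetical numbers, chosen only for illustration.
% States: 1 = false, 2 = true.
pS = [0.5 0.5];                 % P(S)
pL_given_S = [0.99 0.01;        % row 1: P(L | S=1)
              0.90 0.10];       % row 2: P(L | S=2)

% Joint distribution: joint(s, l) = P(S=s) * P(L=l | S=s).
joint = repmat(pS', 1, 2) .* pL_given_S;

% Condition on the evidence L=2 and renormalise to obtain P(S | L=2).
posterior = joint(:, 2)' / sum(joint(:, 2));
fprintf('P(S | L=2) = [%3.5f %3.5f]\n', posterior);
```

The same two steps, building the joint and renormalising the slice that matches the evidence, give the conditional of any variable given any other. An engine such as the junction tree used on slide 9 produces the same quantities without enumerating the full joint.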
  • 11. Weka overview: Used data; Decision tree; Naive Bayes classifier (BNC); K-means clustering
  • 12. Used data: for classification, an ARFF file about diabetes; for clustering, the ARFF file bmw-training.arff.
  • 13. Decision tree build
  • 14. Decision tree build: classification with the decision tree gives about 84% correctly classified instances and about 15% incorrectly classified instances.
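
The correct and incorrect classification rates reported for a classifier can be derived from its confusion matrix. The sketch below shows the arithmetic in plain MATLAB. The counts are hypothetical and are not the confusion matrix behind the percentages on the slides.

```matlab
% Hypothetical confusion matrix (rows = actual class, columns = predicted class).
confusion = [40 10;
              5 45];

n_correct  = trace(confusion);      % diagonal entries are the correct predictions
n_total    = sum(confusion(:));
accuracy   = n_correct / n_total;
error_rate = 1 - accuracy;

fprintf('Correct: %.1f%%  Incorrect: %.1f%%\n', 100 * accuracy, 100 * error_rate);
```

Two classifiers evaluated on the same instances can be compared directly through these two rates.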
  • 15. Decision tree draw
  • 16. BNC build
  • 17. BNC build: classification with the BNC gives about 76% correctly classified instances and about 23% incorrectly classified instances.
  • 18. Compare DT & BNC: the BNC misclassifies more instances than the DT.
  • 19. K-means cluster
  • 20. K-means cluster: the data is divided into 2 clusters, with 500 iterations. Interpretation of the result will be discussed.
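
For readers who want to see what a k-means run with 2 clusters and an iteration cap of 500 does internally, here is a minimal sketch of Lloyd's algorithm in plain MATLAB. It is not Weka's implementation. The six points are hypothetical, the initial centroids are fixed to the first and last point so the run is reproducible (an assumption of this sketch), and empty clusters are not handled.

```matlab
% Hypothetical 2-D points forming two well-separated groups.
X = [1.0 1.1; 0.9 1.0; 1.2 0.8; 5.0 5.2; 5.1 4.9; 4.8 5.0];
k = 2;
max_iter = 500;

% Deterministic initialisation (an assumption): first and last points.
centroids = X([1 end], :);
labels = zeros(size(X, 1), 1);

for iter = 1:max_iter
    % Assignment step: squared distance from every point to every centroid.
    dist = zeros(size(X, 1), k);
    for j = 1:k
        delta = X - repmat(centroids(j, :), size(X, 1), 1);
        dist(:, j) = sum(delta .^ 2, 2);
    end
    [~, new_labels] = min(dist, [], 2);

    % Stop early once the assignments no longer change.
    if isequal(new_labels, labels)
        break;
    end
    labels = new_labels;

    % Update step: move each centroid to the mean of its assigned points.
    for j = 1:k
        centroids(j, :) = mean(X(labels == j, :), 1);
    end
end

disp(labels');
disp(centroids);
```

On well-separated data the loop stops long before the cap. The limit matters mainly for large or overlapping data sets, where assignments can keep shifting for many iterations.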
  • 21. By Ahmad Karawash, PhD, Canada. For more information: Ahmad.karawash@gmail.com