GROUP 4
Investigating the Best Quality Measurements
-- by using mathematical models
Student IDs: 9685718, 9744049, 8455558, 9794830, 8517199, 9652148
- Decision: procurement of Portuguese white wine
- Decision made based on the wine's characteristics
- Classification methods were used
Aim: Find the best-quality wines using the most accurate method
Methods used:
- Decision trees
- Logistic regression
- Multiple regression
- k-NN algorithm
Introduction
01 Initial Trials
-- Decision Tree, Logistic Regression & Multiple Regression
Step/01 Decision Tree
Built using Python
Model run 10 times
Accuracy below 60%
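The slides state the tree was built in Python and the exact script is not shown. As a hedged illustration in MATLAB (the language used for the k-NN code later in this deck), a comparable repeated hold-out experiment might look like the sketch below; it assumes the same 'wine-quality' spreadsheet the k-NN code reads, with the quality score in the last column, and it requires the Statistics and Machine Learning Toolbox.

% Hedged sketch: repeated 80/20 hold-out evaluation of a decision tree.
[alldata,~,~] = xlsread('wine-quality');
acc = zeros(1,10);
for t = 1:10                                   % model run 10 times
    idx  = randperm(size(alldata,1));
    nTr  = round(0.8*size(alldata,1));
    tr   = alldata(idx(1:nTr),:);
    te   = alldata(idx(nTr+1:end),:);
    tree = fitctree(tr(:,1:end-1), tr(:,end)); % CART decision tree
    pred = predict(tree, te(:,1:end-1));
    acc(t) = mean(pred == te(:,end));
end
fprintf('mean accuracy over 10 runs: %.2f%%\n', 100*mean(acc));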
Step/02 Logistic Regression
Performed using MATLAB
Performed 5 times
Accuracy below 55%
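The logistic-regression script is not included in the deck either. Since wine quality takes several discrete values, a hedged reading is a multinomial logistic model; a minimal MATLAB sketch under that assumption (again using the 'wine-quality' spreadsheet) could be:

% Hedged sketch: multinomial logistic regression, repeated 5 times.
[alldata,~,~] = xlsread('wine-quality');
acc = zeros(1,5);
for t = 1:5                                    % performed 5 times
    idx = randperm(size(alldata,1));
    nTr = round(0.8*size(alldata,1));
    tr  = alldata(idx(1:nTr),:);  te = alldata(idx(nTr+1:end),:);
    [ytr,labels] = grp2idx(tr(:,end));         % map quality scores to 1..K
    B = mnrfit(tr(:,1:end-1), ytr);            % multinomial logistic fit
    P = mnrval(B, te(:,1:end-1));              % class probabilities
    [~,kk] = max(P,[],2);                      % most probable class
    pred = str2double(labels(kk));             % back to quality scores
    acc(t) = mean(pred == te(:,end));
end
fprintf('mean accuracy over 5 runs: %.2f%%\n', 100*mean(acc));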
Step/03 Multiple Regression
Performed using Excel
Removed variables with high p-values (insignificant predictors)
Accuracy below 50%
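The slides describe doing this in Excel. As a hedged MATLAB equivalent of the same workflow (fit a linear model, screen predictors by p-value, refit, then round predictions to the nearest quality score), a sketch might look like:

% Hedged sketch: multiple linear regression with p-value screening.
[alldata,~,~] = xlsread('wine-quality');
X = alldata(:,1:end-1);  y = alldata(:,end);
mdl  = fitlm(X, y);                       % full linear model
pv   = mdl.Coefficients.pValue(2:end);    % skip the intercept row
keep = pv < 0.05;                         % drop insignificant variables
mdl2 = fitlm(X(:,keep), y);               % refit on retained variables
pred = round(predict(mdl2, X(:,keep)));   % round to nearest quality score
fprintf('in-sample accuracy: %.2f%%\n', 100*mean(pred == y));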
None of the three methods above was considered reliable enough to use for classification.
02 Basic code interpretation
-- k-NN in MATLAB
function [averagewrong,precision] = main(repeattimes)
[alldata,txt,raw] = xlsread('wine-quality');
len = size(alldata,1);
sample_number = round(0.8*len);              % set the sample (training) data size
test_number = len - sample_number;           % the rest of the data is test data
for k=1:10
    wrong = [];
    for t=1:repeattimes
        sample = [];
        test = [];
        sequence = randperm(len);            % generate a random permutation
        for i=1:len
            if i<=sample_number              % pick random samples from the data
                sample = [sample;alldata(sequence(i),:)];
            else                             % rest of the data as test data
                test = [test;alldata(sequence(i),:)];
            end
        end
Algorithm Design of the Main Function
        wrongnumber = classify(sample,test,k);   % k-NN method applied
        wrong = [wrong wrongnumber];
    end
    averagewrong(k) = mean(wrong);
end
figure;                                          % draw the error plot
plot(averagewrong,'k-*');
title(['Average errors from k=1 to k=10, while repeat times is ',num2str(repeattimes)]);
xlabel('k'); ylabel('errors number')
precision = 1 - averagewrong/test_number;
figure;                                          % draw the precision plot
plot(precision,'r-*');
title(['Average precision from k=1 to k=10, while repeat times is ',num2str(repeattimes)]);
xlabel('k'); ylabel('precision')
Algorithm Design of the Main Function (continued)
Algorithm Design of the Classification Function
function [wrongnumber] = classify(traindata,testdata,k)
test_len = size(testdata,1);
train_len = size(traindata,1);
predict = [];
wrongnumber = 0;
for i=1:test_len
    temp = [];
    test = repmat(testdata(i,:),train_len,1);
    temp = sum((test-traindata).^2,2).^0.5;  % Euclidean distance between the test row and every
                                             % training row (over all columns, incl. the quality column)
    temp = [temp traindata(:,end)];          % first column is distance, second column is quality
    temp = sortrows(temp,1);                 % rank by distance
    tt = ones(9,2);                          % tally the k closest qualities (scores 1..9)
    tt(:,1) = cumsum(tt(:,1));               % column 1 holds the quality labels 1..9
    for w=1:k
        quality = temp(w,2);
        tt(quality,2) = tt(quality,2)+1;
    end
    tt = sortrows(tt,-2);                    % rank qualities by how often each appears among the k
    predict(i) = tt(1,1);                    % choose the quality that appears most often
    if predict(i)~=testdata(i,end)
        wrongnumber = wrongnumber+1;
    end
end
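As a hedged usage example (the data here is invented for illustration): three training points with qualities 5, 5 and 7, and one test point that lies closest to the two quality-5 points, so the k = 3 majority vote predicts quality 5.

% Toy data: last column is the quality label.
train = [0 0 5; 0.1 0 5; 5 5 7];
test  = [0.05 0 5];                % true label 5
wrong = classify(train, test, 3);  % k = 3 -> majority vote picks quality 5
fprintf('misclassified: %d of 1\n', wrong);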
03 Results Analysis
-- in MATLAB
Trial:
>> [averagewrong,precision] = main(50)
averagewrong =
  349.2400  432.3200  452.4800  464.0400  474.8000  478.1600  475.6800  483.8600  480.1200
precision =
  0.6436  0.5589  0.5383  0.5326  0.5265  0.5155  0.5121  0.5146  0.5663  0.5101
Computational Results
Step 1: Basic Model
K=1
Single Variable Accuracy (when k=1)

Variable Name           Accuracy
Fixed Acidity           99.95%
Volatile Acidity        100.00%
Citric Acid             100.00%
Residual Sugar          99.86%
Chlorides               100.00%
Free Sulfur Dioxide     98.59%
Total Sulfur Dioxide    96.68%
Density                 100.00%
pH                      100.00%
Sulphates               100.00%
Alcohol                 99.99% (~100%)
Step 2: Remove variables -- delete the variables with the lower single-variable accuracies
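The per-variable accuracies in the table above could plausibly be reproduced with a sketch like the one below (an assumption; the slides do not show this loop): run the k-NN experiment on one feature column at a time, together with the quality label, calling classify directly on a single random 80/20 split.

% Hedged sketch: single-variable k-NN accuracy, k = 1.
[alldata,~,~] = xlsread('wine-quality');
n   = size(alldata,1);  nTr = round(0.8*n);
idx = randperm(n);
for v = 1:size(alldata,2)-1
    sub   = alldata(:,[v end]);            % one variable + quality label
    tr    = sub(idx(1:nTr),:);  te = sub(idx(nTr+1:end),:);
    wrong = classify(tr, te, 1);           % k = 1, as in the table
    fprintf('variable %2d accuracy: %.2f%%\n', v, 100*(1 - wrong/size(te,1)));
end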
Computational Results
>> [averagewrong,precision] = main(50)
averagewrong =
  0.1800  0.4200  0.7800  0.7400  1.3000  1.5000  2.1800  2.1200  2.0400  2.2800
precision =
  0.9998  0.9996  0.9992  0.9992  0.9987  0.9985  0.9978  0.9978  0.9979  0.9977
Step 2: Remove variables
K=1
Step 3 Normalization
Strictly, the cited definition describes standardization: the transformation of a normally distributed random variable into one following the standard normal distribution (Reference: The Concise Encyclopedia of Statistics, pp. 387-388). What is actually applied here is min-max normalization, which rescales each variable into the interval [0, 1].
*Code of normalization (min-max scaling of each feature column, matching the formula below):
for i=1:size(alldata,2)-1
    alldata(:,i) = (alldata(:,i)-min(alldata(:,i))) / ...
                   (max(alldata(:,i))-min(alldata(:,i)));
end
Formula: X = (x - x_min) / (x_max - x_min) ∈ [0, 1]
Computational Results
>> [averagewrong,precision] = main(50)
averagewrong =
  0.0050  0.0240  0.0120  0.1760  0.1970  0.6160  0.5840  0.9420  0.9920  0.9790
precision =
  1.0000  1.0000  1.0000  0.9998  0.9998  0.9994  0.9994  0.9990  0.9990  0.9990
Step 3 Normalization
K=1
>> [averagewrong,precision] = main(1000)
averagewrong =
  0  0.0200  0.0200  0.1300  0.1590  0.6000  0.5830  0.9800  0.9830  1.0340
precision =
  1.0000  1.0000  1.0000  0.9999  0.9998  0.9994  0.9994  0.9990  0.9990  0.9989
Step 4 Combination
K=1
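The Step 4 code is not shown in the slides. One plausible reading of "Combination" is applying both the Step 2 variable removal and the Step 3 min-max normalization before the k-NN run; the column indices in `keep` below are hypothetical (the 100%-accuracy variables from the Step 2 table).

% Hedged sketch: combined preprocessing before calling the k-NN experiment.
[alldata,~,~] = xlsread('wine-quality');
keep = [2 3 5 8 9 10 11];                      % hypothetical selection
alldata = alldata(:,[keep size(alldata,2)]);   % retained variables + quality label
for i = 1:size(alldata,2)-1                    % Step 3 min-max normalization
    alldata(:,i) = (alldata(:,i)-min(alldata(:,i))) / ...
                   (max(alldata(:,i))-min(alldata(:,i)));
end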
Thank you for watching!
Group 4