Machine learning with
Accord.NET: Wine quality
This example is based on the Wine Quality dataset from the
University of California Irvine Machine Learning Repository:
https://archive.ics.uci.edu/ml/datasets/Wine+Quality
Machine learning
• Current cloud providers (Microsoft, Amazon, Google, …)
have an interest in selling computing power as an API.
• Machine learning takes a lot of computing power.
So they have an interest in making it the next buzzword.
They made "cloud" a buzzword, and they can do it again.
E.g. https://azure.microsoft.com/en-in/services/machine-learning/
However, this time we won’t use any cloud APIs, but an open-source tool called Accord.NET.
A case for machine learning
• We have some existing sample data.
• We want to estimate a variable that is known in the sample data
but not in real life.
• We expect the real-life results to follow the sample data.
Shuffle the rows of the sample data and split them into two
parts:
1) Training set
• Used to find the correct model.
2) Model evaluation set
• Used to verify that the model also works on data outside the training samples.
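The shuffle-and-split step can be sketched in a few lines. This is a minimal Python illustration, not the Accord.NET code behind the talk; the function name `split_rows`, the 75/25 ratio and the fixed seed are assumptions made for the example.

```python
import random

def split_rows(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of the rows and split it into (training, evaluation)."""
    shuffled = list(rows)                  # copy, so the caller's list stays untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(12))  # stand-in for the data rows (hypothetical)
training, evaluation = split_rows(rows)
```

Shuffling before the cut matters when the file is sorted in some way; otherwise the evaluation set could contain only one kind of row.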
A sample dataset:
Wine quality
The dataset contains only measured parameters of wines and a
quality score from 0 to 10 voted by people:
https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv
Can we estimate the quality of an unlisted wine based
on the features we know?
Feature                 Wine 1    Wine 2
fixed acidity           7.4       7.3
volatile acidity        0.7       0.65
citric acid             0         0
residual sugar          1.9       1.2
chlorides               0.076     0.065
free sulfur dioxide     11        15
total sulfur dioxide    34        21
density                 0.9978    0.9946
pH                      3.51      3.39
sulphates               0.56      0.47
alcohol                 9.4       10
quality                 5         7
(Linear) Regression
Creating a linear regression over
one feature is relatively simple.
y = k x + b
The dataset has a large number of
wines, with different alcohol levels
and qualities.
But the dataset also has 10 other
features, so how do we fit a
regression over all 11 variables
combined? Would that take forever?
y = k1 x1 + k2 x2 + … + kn xn + b
Original picture from: http://brandewinder.com/2016/08/06/gradient-boosting-part-1/
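To make y = k x + b concrete, here is a minimal Python sketch of an ordinary least-squares fit over a single feature, plus the multi-feature formula as a plain weighted sum. The function names and the four sample points are hypothetical; the wine example would use alcohol as x and quality as y.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = k*x + b over one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx            # slope
    b = mean_y - k * mean_x  # intercept
    return k, b

def predict(ks, b, xs):
    """Multi-feature form: y = k1*x1 + k2*x2 + ... + kn*xn + b."""
    return sum(k * x for k, x in zip(ks, xs)) + b

# Hypothetical points lying exactly on y = 2x + 1
k, b = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

With 11 features the coefficients are no longer found by this one-line formula, but the prediction step stays the same weighted sum.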
Accord.NET cancer example
Age Smokes Had cancer
55 0 FALSE
28 0 FALSE
65 1 FALSE
46 0 TRUE
86 1 TRUE
56 1 TRUE
85 0 FALSE
33 0 FALSE
21 1 FALSE
42 1 TRUE
Feature    Odds ratio
Age        1.02
Smoking    5.86
Calculation: y(x0, x1) =
0.0206451183100222*x0
+ 1.76788931343272*x1
- 2.45774643623285
Decide()
http://fssnip.net/7Sz
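The numbers above fit together if the model is read as a logistic regression: each odds ratio is the exponential of its coefficient, and the linear score y is turned into a probability with the logistic function. The Python sketch below reuses the coefficients from the slide; the 0.5 decision threshold and the function names are assumptions for illustration, not taken from the Accord.NET snippet.

```python
import math

# Coefficients copied from the calculation above
COEF_AGE = 0.0206451183100222
COEF_SMOKES = 1.76788931343272
INTERCEPT = -2.45774643623285

def score(age, smokes):
    """Linear part: y(x0, x1) with x0 = age and x1 = smokes (0 or 1)."""
    return COEF_AGE * age + COEF_SMOKES * smokes + INTERCEPT

def probability(age, smokes):
    """Logistic function applied to the linear score."""
    return 1.0 / (1.0 + math.exp(-score(age, smokes)))

def decide(age, smokes, threshold=0.5):
    """Assumed decision rule: True when the probability reaches the threshold."""
    return probability(age, smokes) >= threshold

# Odds ratio = exp(coefficient)
odds_ratio_age = math.exp(COEF_AGE)
odds_ratio_smoking = math.exp(COEF_SMOKES)
```

Read this way, one extra year of age multiplies the odds by about 1.02, while smoking multiplies them by about 5.86.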
Decision trees
Instead of combining slopes, create a combination
of feature-condition stumps.
Estimate a few (discrete) categories based on a
combination of decision nodes.
What method should I choose?
http://scikit-learn.org/stable/_static/ml_map.png
[Figure: decision tree with two nodes, pH > 3.5 and Alcohol > 10.6]
Manual example and theory:
http://brandewinder.com/2016/08/06/gradient-boosting-part-1/
http://fssnip.net/7Tz
The figure has just 2 stumps, but real-life machine learning
can generate huge trees.
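A single stump is just one threshold on one feature. As a minimal Python sketch (not the code from the linked snippets), the function below tries every midpoint between neighbouring feature values and keeps the one with the smallest squared error. The function names and the four alcohol/quality pairs are hypothetical.

```python
def fit_stump(xs, ys):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split between two neighbours
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]

def predict_stump(stump, x):
    """Apply the condition 'x > threshold' and return the matching leaf value."""
    threshold, left_value, right_value = stump
    return right_value if x > threshold else left_value

# Hypothetical data: alcohol level -> quality
stump = fit_stump([9.0, 9.5, 11.0, 12.0], [5, 5, 7, 7])
```

A tree repeats the same search inside each branch, which is how two stumps grow into a huge tree.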
Use case: estimating the quality of our event’s wine from Alko
https://www.alko.fi/tuotteet/455518/Frontera-Cabernet-Sauvignon-2016-hanapakkaus
Data from Alko analysis laboratory, wine entry L2BIBS34016:
In Finnish         Value         In English          Value
Alk-%              12,01         Alcohol (%)         12.01
Sokeri             3,5 g/l       Sugar               3.5 g/l
Haihtuvat hapot    0,5 g/l       Volatile acidity    0.5 g/l
Kokonaisrikki      96 mg/l       Total sulfur        96 mg/l
Vapaa rikki        36 mg/l       Free sulfur         36 mg/l
Sitruunahappo      0,045 g/l     Citric acid         0.045 g/l
• This wine is from Chile and the sample data is from Portugal (Vinho Verde wines), so our model has
to be able to work outside the dataset.
• Parameter mismatch:
1) Convert parameters,
2) Remove the parameter from the learning process
→ Measure the error and the effect on model quality
We don’t have    Mean
Fixed acidity    8.32 g/l
Chlorides        0.087 g/l
Density          0.9967 g/cm³
pH               3.31
Sulphates        0.66 g/l

Extra data       Known
Total acids      4.62 g/l
Extract          29.7
Density          “medium”
Cabernet Sauvignon
(Alko provided the data I asked for by email.)
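Both ways of handling the parameter mismatch can be sketched in Python. The values are the laboratory values and means listed above. Mapping "Sugar" to the dataset's "residual sugar" column and the two sulfur values to the "sulfur dioxide" columns is an assumption, as are all function and variable names.

```python
DATASET_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]

def parse_decimal_comma(text):
    """Convert a Finnish-style value such as '0,045 g/l' to the float 0.045."""
    return float(text.split()[0].replace(",", "."))

# Laboratory values as given by Alko (column mapping is an assumption)
lab_values = {
    "alcohol": "12,01",
    "residual sugar": "3,5 g/l",
    "volatile acidity": "0,5 g/l",
    "total sulfur dioxide": "96 mg/l",
    "free sulfur dioxide": "36 mg/l",
    "citric acid": "0,045 g/l",
}

# Means listed above for the features we don't have
column_means = {
    "fixed acidity": 8.32, "chlorides": 0.087, "density": 0.9967,
    "pH": 3.31, "sulphates": 0.66,
}

known = {name: parse_decimal_comma(value) for name, value in lab_values.items()}
missing = [c for c in DATASET_COLUMNS if c not in known]

# Option 1: keep all 11 features and fill the gaps with the means
filled_row = [known.get(c, column_means.get(c)) for c in DATASET_COLUMNS]

# Option 2: train only on the columns we can actually measure
reduced_columns = [c for c in DATASET_COLUMNS if c in known]
reduced_row = [known[c] for c in reduced_columns]
```

Either option changes the model, so the error on the evaluation set should be measured again for the variant that is used.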

Machine learning (using Accord.NET and FSharp)
